From: Alan Maguire <alan.maguire@oracle.com>
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
jolsa@kernel.org
Cc: quentin@isovalent.com, eddyz87@gmail.com, martin.lau@linux.dev,
song@kernel.org, yonghong.song@linux.dev,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com,
haoluo@google.com, masahiroy@kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH v4 bpf-next 00/17] Add kind layout, CRCs to BTF
Date: Tue, 14 Nov 2023 20:16:53 +0000
Message-ID: <f546e2bf-982b-62cd-b2d4-88760d4d97d7@oracle.com>
In-Reply-To: <20231112124834.388735-1-alan.maguire@oracle.com>
On 12/11/2023 12:48, Alan Maguire wrote:
> Update struct btf_header to add a new "kind_layout" section containing
> a description of how to parse the BTF kinds known about at BTF
> encoding time. This provides the opportunity for tools that might
> not know all of these kinds - as is the case when older tools run
> on more newly-generated BTF - to still parse the BTF provided,
> even if it cannot all be used.
>
> Also add CRCs for the BTF and base BTF (if needed) from which it was
> created. CRCs provide a few useful features:
>
> - the base CRC allows us to explicitly identify when the split and
> base BTF are not matched
> - absence of a base BTF CRC can indicate that BTF is standalone;
> i.e. not defined relative to base BTF
>
> The former case can be used to explicitly reject mismatched
> module/kernel BTF rather than assuming it is matched until an
> unexpected type is encountered.
>
> The latter case is useful for modules that are not built as
> frequently as the kernel; in such cases, the module can be built
> standalone by specifying an empty BTF base:
>
> make BTF_BASE= M=path/2/module
>
> If CRCs are not present (as will be the case for pahole versions
> prior to the proposed v1.26 which will support CRC generation),
> standalone BTF can still be identified by a slower fallback
> method of examining BTF type ids to ensure that BTF is
> self-referential only.
>
> To ensure existing tooling can handle standalone BTF for kernel
> modules, we remap the type ids to start after the vmlinux
> BTF ids, to make it appear to be split BTF. This allows tools
> (and the kernel) that assume split BTF for modules to operate normally.
>
Hi folks,

I wanted to capture the feedback I received at my LPC talk [1] on the
approach described here for BTF module generation.
Stepping back, the aim is to provide a way to generate BTF for a module
such that it is somewhat resilient to minor changes in underlying BTF,
so it does not have to be rebuilt every time vmlinux is built. The
module references to vmlinux BTF ids are currently very brittle, and
even for the same kernel we get different vmlinux BTF ids if the BTF is
rebuilt. So the aim is to support a more robust method of module BTF
generation. Note that the approach described here is not needed for
modules that are built at the same time as the kernel, so it's unlikely
any in-tree modules will need this, but it will be useful for cases such
as where modules are delivered via a package and want to make use
of BTF such that it will not be invalidated.
Turning to the talk, the general consensus - I think - was that the
standalone BTF approach described in this series was problematic.
Consider kfuncs: if we have, for example, our own definition of a
structure in standalone module BTF, the BTF id of the local structure
will not match that of the core kernel, which has the potential to
confuse the verifier.
A similar problem exists for tracing; we would trace an sk_buff in
the module via the module's view of struct sk_buff, but we have no
guarantees that the module's view is still consistent with the vmlinux
representation (which actually allocated it).
Hopefully I've characterized this correctly; let me know if I missed
something here.
So we need some means to both remap BTF ids in the module BTF that refer
to the vmlinux BTF so they point at the right types, _and_ to check the
consistency of the representation of a vmlinux type between module BTF
build time and when it is loaded into the kernel.
With this in mind, I think a good way forward might be something like
the following:
For cases where we want more change-independent module BTF - which
is resilient to things like reshuffling of vmlinux BTF ids, and small
changes that don't invalidate structure use completely - we add
a "relocatable" option to the --btf_features list of features for pahole
encoding of module BTF.
This option would not be needed for modules built at the same time as
the kernel, since the BTF ids and the types they refer to are consistent.
When used, however, it would tell BTF dedup in pahole to add relocation
information as well as generating the usual split BTF at the time of module
BTF generation. This relocation information would consist of
descriptions of the BTF types that the module refers to in base BTF and
their dependents. By providing such descriptions, we can then reconcile
the views of types between module and kernel, or if such reconciliation
is impossible, we can refuse to use the BTF. The amount of information
needed for a module will need to be determined, but I'm hopeful in most
cases it would be a small subset of the type information
required for vmlinux as a whole.
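As a rough illustration of which types would end up in that relocation information, here is a userspace sketch that walks a toy type graph and marks the base types reachable from the module's own types. I am reading "dependents" above as the types those base types in turn reference. Everything here (struct toy_type, toy_collect_base_deps(), the fixed-size refs array) is hypothetical and much simpler than real BTF; it is not pahole or libbpf code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REFS 4

/* Hypothetical, simplified type record; real BTF types are variable-length. */
struct toy_type {
	unsigned int refs[TOY_MAX_REFS]; /* ids of referenced types, 0 = unused (void) */
};

/* Depth-first walk from one type; returns the number of base types newly marked. */
static size_t toy_mark(const struct toy_type *types, size_t nr_types,
		       unsigned int first_split_id, unsigned int id, bool *seen)
{
	size_t count = 0;

	if (id == 0 || id >= nr_types || seen[id])
		return 0;
	seen[id] = true;
	if (id < first_split_id)
		count = 1; /* a base (vmlinux) type the module depends on */
	for (size_t i = 0; i < TOY_MAX_REFS; i++)
		count += toy_mark(types, nr_types, first_split_id,
				  types[id].refs[i], seen);
	return count;
}

/*
 * types[] is indexed by type id; ids below first_split_id belong to base BTF,
 * the rest to the module. On return, seen[id] is true for every type
 * reachable from a module type; the base ones among them are the candidates
 * for relocation information. Returns how many base types were marked.
 */
static size_t toy_collect_base_deps(const struct toy_type *types, size_t nr_types,
				    unsigned int first_split_id, bool *seen)
{
	size_t count = 0;

	assert(first_split_id >= 1 && first_split_id <= nr_types);
	for (size_t id = 0; id < nr_types; id++)
		seen[id] = false;
	for (unsigned int id = first_split_id; id < nr_types; id++)
		count += toy_mark(types, nr_types, first_split_id, id, seen);
	return count;
}
```

In this toy model, a module type that points at one base struct pulls in that struct plus whatever it references, while unrelated base types stay unmarked, which is why I'd expect the relocation information to be a small subset of vmlinux BTF.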
The process of reconciling module and vmlinux BTF at module load time
would then be:
1. Remap all the split BTF ids representing module-specific types
   and functions to start at last_vmlinux_id + 1. Since the current
   vmlinux may have a different number of types than the vmlinux
   at time of encoding, this remapping is necessary.

2. For each vmlinux type in our list of relocations, check its
   compatibility with the associated vmlinux type. This is
   somewhat akin to the CO-RE compatibility checks. Exact rules
   would need to be ironed out, but a somewhat loose approach
   would be ideal such that a few minor changes in a struct
   somewhere do not totally invalidate module BTF. Unlike CO-RE
   though, field offset changes are _not_ good since they imply the
   module has an incorrect view of the structure and might
   start using fields incorrectly.

   Note that this is a bit easier than BTF deduplication, because
   the deduplication process that happened at module encoding time
   has already done the dependency checking for us; we just need
   to do a type-by-type, 1-to-1 comparison between our relocation
   types and current vmlinux types.

3. If all types are consistent, BTF is loaded and we remap the
   module's vmlinux BTF id references to the corresponding
   vmlinux BTF ids of the current vmlinux.
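To make the three steps above a bit more concrete, here is a minimal userspace sketch. It is only an illustration under stated assumptions: all names (toy_struct, toy_reloc, toy_reconcile() and so on) are hypothetical and are not kernel or libbpf APIs, types are matched by name only, and the compatibility rule used (every member the module knew about must still exist at the same bit offset, extra members are tolerated) is just one possible "loose" rule, since the exact rules still need to be ironed out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MEMBERS 8

struct toy_member {
	const char *name;
	unsigned int bit_offset;
};

/* Hypothetical flattened view of a struct type. */
struct toy_struct {
	const char *name;
	unsigned int size;
	unsigned int nr_members;
	struct toy_member members[TOY_MAX_MEMBERS];
};

/* One relocation record: a vmlinux type as the module saw it at build time. */
struct toy_reloc {
	unsigned int old_id;        /* id in the build-time vmlinux BTF */
	struct toy_struct expected; /* build-time description */
	unsigned int new_id;        /* id in the current vmlinux, set on success */
};

/* Step 1: shift a module-specific id so it starts after the current vmlinux ids. */
static unsigned int toy_remap_split_id(unsigned int id, unsigned int old_last_base_id,
				       unsigned int new_last_base_id)
{
	assert(id > old_last_base_id); /* must be a split (module) id */
	return id - old_last_base_id + new_last_base_id;
}

/* Step 2 (assumed rule): known members must exist at unchanged offsets. */
static bool toy_compatible(const struct toy_struct *expected,
			   const struct toy_struct *cur)
{
	for (unsigned int i = 0; i < expected->nr_members; i++) {
		bool ok = false;

		for (unsigned int j = 0; j < cur->nr_members && !ok; j++)
			ok = strcmp(cur->members[j].name, expected->members[i].name) == 0 &&
			     cur->members[j].bit_offset == expected->members[i].bit_offset;
		if (!ok)
			return false;
	}
	return true;
}

/*
 * Steps 2 and 3: vmlinux[] is indexed by type id (entry 0 is void and is
 * skipped). Returns false if any relocation has no compatible match, in
 * which case the module BTF should be refused; new_id values are only
 * meaningful when true is returned.
 */
static bool toy_reconcile(struct toy_reloc *relocs, size_t nr_relocs,
			  const struct toy_struct *vmlinux, unsigned int nr_vmlinux)
{
	for (size_t r = 0; r < nr_relocs; r++) {
		bool found = false;

		for (unsigned int id = 1; id < nr_vmlinux && !found; id++) {
			if (strcmp(vmlinux[id].name, relocs[r].expected.name) != 0)
				continue;
			if (!toy_compatible(&relocs[r].expected, &vmlinux[id]))
				return false;
			relocs[r].new_id = id; /* step 3: record the current id */
			found = true;
		}
		if (!found)
			return false;
	}
	return true;
}
```

With this rule, a struct that gained a member at the end since the module was built still reconciles, while one where a known field moved to a different offset causes the module BTF to be rejected.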
I _think_ this gets us what we want; more resilient module BTF,
but with safety checks to ensure compatible representations.
There were some suggestions of using a hashing method, but I think
such a method presupposes we want exact type matches, which I suspect
would be unlikely to be useful in practice: in most stable-based
distros, small changes to types are made as a result of fixes etc.
There was also a suggestion of doing a full dedup, but I think the
consensus in the room (which I agree with) is that this would be hard
to do in-kernel. So the above approach is a compromise I think;
it gets actual dedup at BTF creation time to create the list of
references and dependents, and we later check them one-by-one on module
load for compatibility.
Anyway I just wanted to try and capture the feedback received, and
lay out a possible direction. Any further thoughts or suggestions
would be much appreciated. Thanks!
Alan
[1] https://lpc.events/event/17/contributions/1576/