From: Alan Maguire <alan.maguire@oracle.com>
To: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
Arnaldo de Melo <acme@redhat.com>,
Quentin Monnet <quentin@isovalent.com>,
Eddy Z <eddyz87@gmail.com>,
mykolal@fb.com, Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>,
yonghong.song@linux.dev,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
houtao1@huawei.com, bpf <bpf@vger.kernel.org>,
Masahiro Yamada <masahiroy@kernel.org>,
"Luis R. Rodriguez" <mcgrof@kernel.org>,
Nathan Chancellor <nathan@kernel.org>
Subject: Re: [PATCH v4 bpf-next 00/11] bpf: support resilient split BTF
Date: Fri, 17 May 2024 12:56:36 +0100
Message-ID: <3fc71c22-cf9d-410a-bcc0-6de0d21b7cda@oracle.com>
In-Reply-To: <CA+JHD93=ZcVN4GxepbRF6SLorWJjw0gCgJZUYxQG5hxFehdHUw@mail.gmail.com>
On 17/05/2024 12:11, Arnaldo Carvalho de Melo wrote:
> On Fri, May 17, 2024, 7:23 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> Split BPF Type Format (BTF) provides huge advantages in that kernel
> modules only have to provide type information for types that they do not
> share with the core kernel; for core kernel types, split BTF refers to
> core kernel BTF type ids. So for a STRUCT sk_buff, a module that
> uses that structure (or a pointer to it) simply needs to refer to the
> core kernel type id, saving the need to define the structure and its
> many dependents. This cuts down on duplication and makes BTF as compact
> as possible.
>
> However, there is a downside. This scheme requires the references from
> split BTF to base BTF to be valid not just at encoding time, but at use
> time (when the module is loaded). Even a small change in kernel types
> can perturb the type ids in core kernel BTF, and due to pahole's
> parallel processing of compilation units, even an unchanged kernel can
> have different type ids if BTF is re-generated.
>
> I think it would be informative to mention the recently added
> "reproducible_build" feature, i.e. rephrase to "... if the
> reproducible_build isn't selected via --btf_features..." in the relevant
> documentation.
>
Yeah, sorry; this part should have been updated after the
reproducible_build feature landed.
> - Arnaldo
>
> Sent from smartphone, still on my way back home from LSF/MM+BPF
>
> So we have a robustness
> problem for split BTF for cases where a module is not always compiled at
> the same time as the kernel. This problem is particularly acute for
> distros which generally want module builders to be able to compile a
> module for the lifetime of a Linux stable-based release, and have it
> continue to be valid over the lifetime of that release, even as changes
> in data structures (and hence BTF types) accrue. Today it's not
> possible to generate BTF for modules that works beyond the initial
> kernel it is compiled against - kernel bugfixes etc invalidate the split
> BTF references to vmlinux BTF, and BTF is no longer usable for the
> module.
>
> The goal of this series is to supply additional context for cases
> like this. That context comes in the form of
> distilled base BTF; it stands in for the base BTF, and contains
> information about the types referenced from split BTF, but not their
> full descriptions. The modified split BTF will refer to type ids in
> this .BTF.base section, and when the kernel loads such modules it
> will use that .BTF.base to map references from split BTF to the
> equivalent current vmlinux base BTF types. Once this relocation
> process has succeeded, the module BTF available in /sys/kernel/btf
> will look exactly as if it was built with the current vmlinux;
> references to base types will be fixed up etc.
>
> A module builder - using this series along with the pahole changes -
> can then build a module with distilled base BTF via an out-of-tree
> module build, i.e.
>
>     make -C . M=path/2/module
>
> The module will have a .BTF section (the split BTF) and a
> .BTF.base section. The latter is small in size - distilled base
> BTF does not need full struct/union/enum information for named
> types for example. For 2667 modules built with distilled base BTF,
> the average size observed was 1556 bytes (stddev 1563). The overall
> size added to these 2667 modules was 5.3Mb.
>
> Note that for the in-tree modules, this approach is not needed as
> split and base BTF in the case of in-tree modules are always built
> and re-built together.
>
> The series first focuses on generating split BTF with distilled base
> BTF, and provides btf__parse_opts() which allows specification
> of the section name from which to read BTF data, since we now have
> both .BTF and .BTF.base sections that can contain such data.
>
> Then we add support to resolve_btfids for generating the .BTF.ids
> section with reference to the .BTF.base section - this ensures the
> .BTF.ids match those used in the split/base BTF.
>
> Finally the series provides the mechanism for relocating split BTF with
> a new base; the distilled base BTF is used to map the references to base
> BTF in the split BTF to the new base. For the kernel, this relocation
> process happens at module load time, and we relocate split BTF
> references to point at types in the current vmlinux BTF. As part of
> this, .BTF.ids references need to be mapped also.
>
> So concretely, what happens is
>
> - we generate split BTF in the .BTF section of a module that refers to
> types in the .BTF.base section as base types; the latter are not full
> type descriptions but provide information about the base type. So
> a STRUCT sk_buff would be represented as a FWD struct sk_buff in
> distilled base BTF for example.
> - when the module is loaded, the split BTF is relocated with vmlinux
> BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
> in vmlinux BTF and map all split BTF references to the distilled base
> FWD sk_buff, replacing them with references to the vmlinux BTF
> STRUCT sk_buff.
>
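To make the two steps above concrete, here is a toy sketch of name-based
relocation. It is only an illustration under simplifying assumptions, not
the libbpf or kernel code: struct toy_type, build_id_map() and
relocate_refs() are hypothetical names, the type ids are made up, and
split BTF is reduced to a flat array of references to base type ids.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for BTF kinds; real BTF has many more. */
enum toy_kind { TOY_FWD, TOY_STRUCT };

struct toy_type {
	const char *name;
	enum toy_kind kind;
	unsigned int id;	/* type id within its own BTF */
};

/* Look up a type by name; a linear scan keeps the sketch short. */
static const struct toy_type *find_by_name(const struct toy_type *types,
					   size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(types[i].name, name) == 0)
			return &types[i];
	return NULL;
}

/*
 * Fill id_map[distilled id] = current vmlinux id.
 * Returns 0 on success, -1 if a distilled type has no match.
 */
static int build_id_map(const struct toy_type *distilled, size_t nd,
			const struct toy_type *vmlinux, size_t nv,
			unsigned int *id_map, size_t map_len)
{
	for (size_t i = 0; i < nd; i++) {
		const struct toy_type *t;

		t = find_by_name(vmlinux, nv, distilled[i].name);
		if (!t || distilled[i].id >= map_len)
			return -1;
		id_map[distilled[i].id] = t->id;
	}
	return 0;
}

/* Rewrite references to distilled base ids so they point at vmlinux ids. */
static void relocate_refs(unsigned int *refs, size_t n,
			  const unsigned int *id_map, size_t map_len)
{
	for (size_t i = 0; i < n; i++) {
		assert(refs[i] < map_len);
		refs[i] = id_map[refs[i]];
	}
}
```

With a distilled entry { "sk_buff", TOY_FWD, 1 } and a vmlinux entry
{ "sk_buff", TOY_STRUCT, 4242 }, every reference to id 1 ends up as 4242
after relocate_refs().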
> Support is also added to bpftool to be able to display split BTF
> relative to its .BTF.base section, and also to display the relocated
> form via the "-R path_to_base_btf" option.
>
> A previous approach to this problem [1] utilized standalone BTF for such
> cases - where the BTF is not defined relative to base BTF so there is no
> relocation required. The problem with that approach is that from
> the verifier perspective, some types are special, and having a custom
> representation of a core kernel type that did not necessarily match the
> current representation is not tenable. So the approach taken here was
> to preserve the split BTF model while minimizing the representation of
> the context needed to relocate split and current vmlinux BTF.
>
> To generate distilled .BTF.base sections the associated dwarves
> patch (to be applied on the "next" branch there) is needed.
> Without it, things will still work but modules will not be built
> with a .BTF.base section.
>
> Changes since v3[3]:
>
> - distill now checks for duplicate-named struct/unions and records
> them as a sized struct/union to help identify which of the
> multiple base BTF structs/unions it refers to (Eduard, patch 1)
> - added test support for multiple name handling (Eduard, patch 2)
> - simplified the string mapping when updating split BTF to use
> base BTF instead of distilled base. Since the only string
> references split BTF can make to base BTF are the names of
> the base types, create a string map from distilled string
> offset -> base BTF string offset and update string offsets
> by visiting all strings in split BTF; this saves having to
> do costly searches of base BTF (Eduard, patch 7,10)
> - fixed bpftool manpage and indentation issues (Quentin, patch 11)
>
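The string mapping described in the v3 changes can be pictured roughly as
follows. This is a simplified sketch, not the code in the series: it
assumes BTF-style string sections (NUL-separated strings, with the empty
string at offset 0 and a trailing NUL), and assumes split BTF string
offsets at or beyond the base string section length refer to the split
BTF's own strings. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of 'name' in a string section, or -1 if absent. */
static long find_str_off(const char *strs, size_t len, const char *name)
{
	size_t off = 0;

	while (off < len) {
		if (strcmp(strs + off, name) == 0)
			return (long)off;
		off += strlen(strs + off) + 1;
	}
	return -1;
}

/*
 * Build map[distilled string offset] = base string offset, once, for
 * every string in the distilled section. 'map' needs dist_len entries.
 */
static int build_str_map(const char *dist, size_t dist_len,
			 const char *base, size_t base_len,
			 unsigned int *map)
{
	size_t off = 0;

	while (off < dist_len) {
		long b = find_str_off(base, base_len, dist + off);

		if (b < 0)
			return -1;
		map[off] = (unsigned int)b;
		off += strlen(dist + off) + 1;
	}
	return 0;
}

/*
 * Remap one split BTF string offset: offsets into the distilled section
 * go through the map, the split BTF's own strings are shifted to sit
 * after the new (larger) base string section.
 */
static unsigned int remap_str_off(unsigned int off, const unsigned int *map,
				  size_t dist_len, size_t base_len)
{
	assert(map != NULL);
	if (off < dist_len)
		return map[off];
	return off - (unsigned int)dist_len + (unsigned int)base_len;
}
```

The point of the map is that base BTF is searched only once per distilled
string; each string offset visited in split BTF is then a table lookup.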
> Also explored Eduard's suggestion of doing an implicit fallback
> to checking for .BTF.base section in btf__parse() when it is
> called to get base BTF. However, while it is doable, it turned
> out to be difficult operationally. Since fallback is implicit
> we do not know the source of the BTF - was it from .BTF or
> .BTF.base? In bpftool, we want to try first standalone BTF,
> then split, then split with distilled base. Having a way
> to explicitly request .BTF.base via btf__parse_opts() fits
> that model better.
>
> Changes since v2[4]:
>
> - submitted patch to use --btf_features in Makefile.btf for pahole
> v1.26 and later separately (Andrii). That has landed in bpf-next
> now.
> - distilled base now encodes ENUM64 as fwd ENUM (size 8), eliminating
> the need for support for ENUM64 in btf__add_fwd (patch 1, Andrii)
> - moved to distilling only named types, augmenting split BTF with
> associated reference types; this greatly simplifies the distilled
> base BTF and the mapping operation between distilled and base
> BTF when relocating (most of the series changes, Andrii)
> - relocation now iterates over base BTF, looking for matches based
> on name in distilled BTF. Distilled BTF is pre-sorted by name
> (Andrii, patch 8)
> - removed most redundant compatibility checks aside from struct
> size for base types/embedded structs and kind compatibility
> (since we only match on name) (Andrii, patch 8)
> - btf__parse_opts() now replaces btf_parse() internally in libbpf
> (Eduard, patch 3)
>
> Changes since RFC [5]:
>
> - updated terminology; we replace clunky "base reference" BTF with
> distilling base BTF into a .BTF.base section. Similarly BTF
> reconciliation becomes BTF relocation (Andrii, most patches)
> - add distilled base BTF by default for out-of-tree modules
> (Alexei, patch 8)
> - distill algorithm updated to record size of embedded struct/union
> by recording it as a 0-vlen STRUCT/UNION with size preserved
> (Andrii, patch 2)
> - verify size match on relocation for such STRUCT/UNIONs (Andrii,
> patch 9)
> - with embedded STRUCT/UNION recording size, we can have bpftool
> dump a header representation using .BTF.base + .BTF sections
> rather than special-casing and refusing to use "format c" for
> that case (patch 5)
> - match enum with enum64 and vice versa (Andrii, patch 9)
> - ensure that resolve_btfids works with BTF without .BTF.base
> section (patch 7)
> - update tests to cover embedded types, arrays and function
> prototypes (patches 3, 12)
>
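The kind and size checks mentioned in the change lists above (forward
declarations matching full definitions, sized STRUCT/UNION entries
requiring an equal size, enum matching enum64 and vice versa) could be
sketched like this. It is a guess at the shape of the rules for
illustration only; struct toy_type and toy_compatible() are hypothetical
and the real checks in the series may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind {
	TOY_FWD_STRUCT, TOY_FWD_UNION,	/* name only, no size recorded */
	TOY_STRUCT, TOY_UNION,		/* 0-vlen entry with size preserved */
	TOY_ENUM, TOY_ENUM64,
};

struct toy_type {
	enum toy_kind kind;
	unsigned int size;	/* only meaningful for sized kinds */
};

static bool toy_is_enum(enum toy_kind k)
{
	return k == TOY_ENUM || k == TOY_ENUM64;
}

/* Can the distilled entry 'dist' stand in for the base type 'base'? */
static bool toy_compatible(const struct toy_type *dist,
			   const struct toy_type *base)
{
	assert(dist != NULL && base != NULL);

	switch (dist->kind) {
	case TOY_FWD_STRUCT:
		return base->kind == TOY_STRUCT;
	case TOY_FWD_UNION:
		return base->kind == TOY_UNION;
	case TOY_STRUCT:
	case TOY_UNION:
		/* embedded in split BTF types, so the size must not change */
		return base->kind == dist->kind && base->size == dist->size;
	case TOY_ENUM:
	case TOY_ENUM64:
		return toy_is_enum(base->kind);
	}
	return false;
}
```

Recording the size only for embedded (or duplicate-named) structs keeps
the distilled base small while still catching the one change that would
silently shift field offsets in a split BTF type.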
> [1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@oracle.com/
> [2] https://lore.kernel.org/bpf/20240501175035.2476830-1-alan.maguire@oracle.com/
> [3] https://lore.kernel.org/bpf/20240510103052.850012-1-alan.maguire@oracle.com/
> [4] https://lore.kernel.org/bpf/20240424154806.3417662-1-alan.maguire@oracle.com/
> [5] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@oracle.com/
>
> Alan Maguire (11):
> libbpf: add btf__distill_base() creating split BTF with distilled base
> BTF
> selftests/bpf: test distilled base, split BTF generation
> libbpf: add btf__parse_opts() API for flexible BTF parsing
> bpftool: support displaying raw split BTF using base BTF section as
> base
> resolve_btfids: use .BTF.base ELF section as base BTF if -B option is
> used
> kbuild, bpf: add module-specific pahole/resolve_btfids flags for
> distilled base BTF
> libbpf: split BTF relocation
> selftests/bpf: extend distilled BTF tests to cover BTF relocation
> module, bpf: store BTF base pointer in struct module
> libbpf,bpf: share BTF relocate-related code with kernel
> bpftool: support displaying relocated-with-base split BTF
>
> include/linux/btf.h | 45 ++
> include/linux/module.h | 2 +
> kernel/bpf/Makefile | 8 +
> kernel/bpf/btf.c | 166 +++--
> kernel/module/main.c | 5 +-
> scripts/Makefile.btf | 7 +
> scripts/Makefile.modfinal | 4 +-
> .../bpf/bpftool/Documentation/bpftool-btf.rst | 15 +-
> tools/bpf/bpftool/bash-completion/bpftool | 7 +-
> tools/bpf/bpftool/btf.c | 19 +-
> tools/bpf/bpftool/main.c | 14 +-
> tools/bpf/bpftool/main.h | 2 +
> tools/bpf/resolve_btfids/main.c | 28 +-
> tools/lib/bpf/Build | 2 +-
> tools/lib/bpf/btf.c | 605 +++++++++++++-----
> tools/lib/bpf/btf.h | 59 ++
> tools/lib/bpf/btf_common.c | 143 +++++
> tools/lib/bpf/btf_relocate.c | 341 ++++++++++
> tools/lib/bpf/libbpf.map | 3 +
> tools/lib/bpf/libbpf_internal.h | 3 +
> .../selftests/bpf/prog_tests/btf_distill.c | 346 ++++++++++
> 21 files changed, 1612 insertions(+), 212 deletions(-)
> create mode 100644 tools/lib/bpf/btf_common.c
> create mode 100644 tools/lib/bpf/btf_relocate.c
> create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c
>
> --
> 2.31.1
>
>