From: Eduard Zingerman <eddyz87@gmail.com>
To: Alan Maguire <alan.maguire@oracle.com>,
bpf@vger.kernel.org, ast@kernel.org
Cc: andrii@kernel.org, daniel@iogearbox.net, kernel-team@fb.com,
yhs@fb.com, arnaldo.melo@gmail.com
Subject: Re: [RFC bpf-next 00/12] Use uapi kernel headers with vmlinux.h
Date: Thu, 27 Oct 2022 02:54:33 +0300 [thread overview]
Message-ID: <eaf0f6a5dd6a8b2665b84770f83b3db577abd72e.camel@gmail.com> (raw)
In-Reply-To: <81a01604-83db-25a2-d8e0-af1607ac30a9@oracle.com>
On Wed, 2022-10-26 at 12:10 +0100, Alan Maguire wrote:
> On 25/10/2022 23:27, Eduard Zingerman wrote:
> > Hi BPF community,
> >
> > AFAIK there is a long standing feature request to use kernel headers
> > alongside `vmlinux.h` generated by `bpftool`. For example significant
> > effort was put to add an attribute `bpf_dominating_decl` (see [1]) to
> > clang, unfortunately this effort was stuck due to concerns regarding C
> > language semantics.
> >
>
> Thanks for the details! It's a tricky problem to solve.
>
> Before diving into this, can I ask if there's another way round this;
> is there no way we could teach vmlinux.h generation which types to
> skip via some kind of bootstrapping process?
>
> For a bpf object foo.bpf.c that wants to include linux/tcp.h and
> vmlinux.h, something like this seems possible;
>
> 1. run the preprocessor (gcc -E) over the BPF program to generate
> a bootstrap header foo.bpf.exclude_types.h consisting of all the types mentioned
> in it and associated includes, but not in vmlinux.h It would need to -D__VMLINUX_H__
> to avoid including vmlinux.h definitions and -D__BPF_EXCLUDE_TYPE_BOOTSTRAP__ to
> skip the programs themselves, which would need a guard around them I think:
>
> #include <stddef.h>
> #include <stdbool.h>
> #include <vmlinux.h>
> #include <linux/tcp.h>
>
> #ifndef __BPF_EXCLUDE_TYPE_BOOTSTRAP__
> //rest of program here
> #endif
>
> So as a result of this, we now have a single header file that contains all the types
> that non-vmlinux.h include files define.
>
> 2. now to generate vmlinux.h, pass foo.bpf.exclude_types.h into "bpftool btf dump" as an
> exclude header:
>
> bpftool btf dump --exclude /tmp/foo.bpf.types.h file /sys/kernel/btf/vmlinux format c > vmlinux.h
>
> bpftool would have to parse the exclude header for actual type definitions, spotting struct,
> enum and typedef definitions. This is likely made easier by running the preprocessor
> at least since formatting is probably quite uniform. vmlinux.h could simply emit forward declarations
> for types described both in vmlinux BTF and in the exclude file.
>
> So the build process would be
>
> - start with empty vmlinux.h
> - bootstrap a header consisting of the set of types to exclude via c preprocessor
> - generate new vmlinux.h based on above
> - build bpf program
>
> Build processes for BPF objects already has bootstrapping elements like
> this for generating vmlinux.h and skeletons, so while it's not perfect it might
> be a simpler approach. There may be problems with this I'm not seeing though?
I like the tool based approach more but I heard there were some
reservations about separate tool complicating the build process.
- the tool does not require changes to the kbuild;
- the tool is not limited to uapi headers;
- the tool could imitate the dominating declarations attribute and
address the issue with definitions miss-match between vmlinux.h and
system headers (see my reply to Alexei in the parallel sub-thread).
To support this point it would have to work in a somewhat different
order:
- read/pre-process the input file;
- remove from it the definitions that are also present in kernel BTF;
- prepend vmlinux.h to the beginning of the file.
On the other hand I think that implementation would need a C parser /
type analysis step. Thus it would be harder to implement it as a part
of bpftool. But the prototype using something like [1] should be simple.
Thanks,
Eduard
[1] https://github.com/inducer/pycparserext
>
> Thanks!
>
> Alan
>
> > After some discussion with Alexei and Yonghong I'd like to request
> > your comments regarding a somewhat brittle and partial solution to
> > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in
> > the generated `vmlinux.h`.
> >
> > The basic idea
> > ---
> >
> > The goal of the patch set is to allow usage of header files from
> > `include/uapi` alongside `vmlinux.h` as follows:
> >
> > #include <uapi/linux/tcp.h>
> > #include "vmlinux.h"
> >
> > This goal is achieved by adding `#ifndef ... #endif` guards in
> > `vmlinux.h` around definitions that originate from the `include/uapi`
> > headers. The guards emitted match the guards used in the original
> > headers. E.g. as follows:
> >
> > include/uapi/linux/tcp.h:
> >
> > #ifndef _UAPI_LINUX_TCP_H
> > #define _UAPI_LINUX_TCP_H
> > ...
> > union tcp_word_hdr {
> > struct tcphdr hdr;
> > __be32 words[5];
> > };
> > ...
> > #endif /* _UAPI_LINUX_TCP_H */
> >
> > vmlinux.h:
> >
> > ...
> > #ifndef _UAPI_LINUX_TCP_H
> >
> > union tcp_word_hdr {
> > struct tcphdr hdr;
> > __be32 words[5];
> > };
> >
> > #endif /* _UAPI_LINUX_TCP_H */
> > ...
> >
> > To get to this state the following steps are necessary:
> > - "header guard" name should be identified for each header file;
> > - the correspondence between data type and it's header guard has to be
> > encoded in BTF;
> > - `bpftool` should be adjusted to emit `#ifndef FOO_H ... #endif`
> > brackets.
> >
> > It is not possible to identify header guard names for all uapi headers
> > basing only on the file name. However a simple script could devised to
> > identify the guards basing on the file name and it's content. Thus it
> > is possible to obtain the list of header names with corresponding
> > header guards.
> >
> > The correspondence between type and it's declaration file (header) is
> > available in DWARF as `DW_AT_decl_file` attribute. The
> > `DW_AT_decl_file` can be matched with the list of header guards
> > described above to obtain the header guard name for a specific type.
> >
> > The `pahole` generates BTF using DWARF. It is possible to modify
> > `pahole` to accept the header guards list as an additional parameter
> > and to encode the header guard names in BTF.
> >
> > Implementation details
> > ---
> >
> > Present patch-set implements these ideas as follows:
> > - A parameter `--header_guards_db` is added to `pahole`. If present it
> > points to a file with a list of `<header> <guard>` records.
> > - `pahole` uses DWARF `DW_AT_decl_file` value to lookup the header
> > guard for each type emitted to BTF. If header guard is present it is
> > encoded alongside the type.
> > - Header guards are encoded in BTF as `BTF_DECL_TAG` records with a
> > special prefix. The prefix "header_guard:" is added to a value of
> > such tags. (Here `BTF_DECL_TAG` is used to avoid BTF binary format
> > changes).
> > - A special script `infer_header_guards.pl` is added as a part of
> > kbuild, it can infer header guard names for each UAPI header basing
> > on the header content.
> > - This script is invoked from `link-vmlinux.sh` prior to BTF
> > generation during kernel build. The output of the script is saved to
> > a file, the file is passed to `pahole` as `--header_guards_db`
> > parameter.
> > - `libbpf` is modified to aggregate `BTF_DECL_TAG` records for each
> > type and to emit `#ifndef FOO_H ... #endif` brackets when
> > "header_guard:" tag is present for a type.
> >
> > Details for each patch in a set:
> > - libbpf: Deduplicate unambigous standalone forward declarations
> > - selftests/bpf: Tests for standalone forward BTF declarations deduplication
> >
> > There is a small number (63 for defconfig) of forward declarations
> > that are not de-duplicated with the main type declaration under
> > certain conditions. This hinders the header guard brackets
> > generation. This patch addresses this de-duplication issue.
> >
> > - libbpf: Support for BTF_DECL_TAG dump in C format
> > - selftests/bpf: Tests for BTF_DECL_TAG dump in C format
> >
> > Currently libbpf does not process BTF_DECL_TAG when btf is dumped in
> > C format. This patch adds a hash table matching btf type ids with a
> > list of decl tags to the struct btf_dump.
> > The `btf_dump_emit_decl_tags` is not necessary for the overall
> > patch-set to function but simplifies testing a bit.
> >
> > - libbpf: Header guards for selected data structures in vmlinux.h
> > - selftests/bpf: Tests for header guards printing in BTF dump
> >
> > Adds option `emit_header_guards` to `struct btf_dump_opts`.
> > When enabled the `btf_dump__dump_type` prints `#ifndef ... #endif`
> > brackets around types for which header guard information is present
> > in BTF.
> >
> > - bpftool: Enable header guards generation
> >
> > Unconditionally enables `emit_header_guards` for BTF dump in C format.
> >
> > - kbuild: Script to infer header guard values for uapi headers
> > - kbuild: Header guards for types from include/uapi/*.h in kernel BTF
> >
> > Adds `scripts/infer_header_guards.pl` and integrates it with
> > `link-vmlinux.sh`.
> >
> > - selftests/bpf: Script to verify uapi headers usage with vmlinux.h
> >
> > Adds a script `test_uapi_headers.py` that tests header guards with
> > vmlinux.h by compiling a simple C snippet. The snippet looks as
> > follows:
> >
> > #include <some_uapi_header.h>
> > #include "vmlinux.h"
> >
> > __attribute__((section("tc"), used))
> > int syncookie_tc(struct __sk_buff *skb) { return 0; }
> >
> > The list of headers to test comes from
> > `tools/testing/selftests/bpf/good_uapi_headers.txt`.
> >
> > - selftests/bpf: Known good uapi headers for test_uapi_headers.py
> >
> > The list of uapi headers that could be included alongside vmlinux.h.
> > The headers are peeked from the following locations:
> > - <headers-export-dir>/linux/*.h
> > - <headers-export-dir>/linux/**/*.h
> > This choice of locations is somewhat arbitrary.
> >
> > - selftests/bpf: script for infer_header_guards.pl testing
> >
> > The test case for `scripts/infer_header_guards.pl`, verifies that
> > header guards can be inferred for all uapi headers.
> >
> > - There is also a patch for dwarves that adds `--header_guards_db`
> > option (see [2]).
> >
> > The `test_uapi_headers.py` is important as it demonstrates the
> > the necessary compiler flags:
> >
> > clang ... \
> > -D__x86_64__ \
> > -Xclang -fwchar-type=short \
> > -Xclang -fno-signed-wchar \
> > -I{exported_kernel_headers}/include/ \
> > ...
> >
> > - `-fwchar-type=short` and `-fno-signed-wchar` had to be added because
> > BPF target uses `int` for `wchar_t` by default and this differs from
> > `vmlinux.h` definition of the type (at least for x86_64).
> > - `__x86_64__` had to be added for uapi headers that include
> > `stddef.h` (the one that is supplied my CLANG itself), in order to
> > define correct sizes for `size_t` and `ptrdiff_t`.
> > - The `{exported_kernel_headers}` stands for exported kernel headers
> > directory (the headers obtained by `make headers_install` or via
> > distribution package).
> >
> > When it works
> > ---
> >
> > The mechanics described above works for a significant number of UAPI
> > headers. For example, for the test case above I chose the headers from
> > the following locations:
> > - linux/*.h
> > - linux/**/*.h
> > There are 759 such headers and for 677 of them the test described
> > above passes.
> >
> > I excluded the headers from the following sub-directories as
> > potentially not interesting:
> >
> > asm rdma video xen
> > asm-generic misc scsi
> > drm mtd sound
> >
> > Thus saving some time for both discussion and CI but the choice is
> > somewhat arbitrary. If I run `test_uapi_headers.py --test '*'` (all
> > headers) test passes for 834 out of 972 headers.
> >
> > When it breaks
> > ---
> >
> > There several scenarios when this mechanics breaks.
> > Specifically I found the following cases:
> > - When uapi header includes some system header that conflicts with
> > vmlinux.h.
> > - When uapi header itself conflicts with vmlinux.h.
> >
> > Below are examples for both cases.
> >
> > Conflict with system headers
> > ----
> >
> > The following uapi headers:
> > - linux/atmbr2684.h
> > - linux/bpfilter.h
> > - linux/gsmmux.h
> > - linux/icmp.h
> > - linux/if.h
> > - linux/if_arp.h
> > - linux/if_bonding.h
> > - linux/if_pppox.h
> > - linux/if_tunnel.h
> > - linux/ip6_tunnel.h
> > - linux/llc.h
> > - linux/mctp.h
> > - linux/mptcp.h
> > - linux/netdevice.h
> > - linux/netfilter/xt_RATEEST.h
> > - linux/netfilter/xt_hashlimit.h
> > - linux/netfilter/xt_physdev.h
> > - linux/netfilter/xt_rateest.h
> > - linux/netfilter_arp/arp_tables.h
> > - linux/netfilter_arp/arpt_mangle.h
> > - linux/netfilter_bridge.h
> > - linux/netfilter_bridge/ebtables.h
> > - linux/netfilter_ipv4/ip_tables.h
> > - linux/netfilter_ipv6/ip6_tables.h
> > - linux/route.h
> > - linux/wireless.h
> >
> > Include the following system header:
> > - /usr/include/sys/socket.h (all via linux/if.h)
> >
> > The sys/socket.h conflicts with vmlinux.h in:
> > - types: struct iovec, struct sockaddr, struct msghdr, ...
> > - constants: SOCK_STREAM, SOCK_DGRAM, ...
> >
> > However, only two types are actually used:
> > - struct sockaddr
> > - struct sockaddr_storage (used only in linux/mptcp.h)
> >
> > In 'vmlinux.h' this type originates from 'kernel/include/socket.h'
> > (non UAPI header), thus does not have a header guard.
> >
> > The only workaround that I see is to:
> > - define a stub sys/socket.h as follows:
> >
> > #ifndef __BPF_SOCKADDR__
> > #define __BPF_SOCKADDR__
> >
> > /* For __kernel_sa_family_t */
> > #include <linux/socket.h>
> >
> > struct sockaddr {
> > __kernel_sa_family_t sa_family;
> > char sa_data[14];
> > };
> >
> > #endif
> >
> > - hardcode generation of __BPF_SOCKADDR__ bracket for
> > 'struct sockaddr' in vmlinux.h.
> >
> > Another possibility is to move the definition of 'struct sockaddr'
> > from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h',
> > but I expect that this won't fly with the mainline as it might break
> > the programs that include both 'linux/socket.h' and 'sys/socket.h'.
> >
> > Conflict with vmlinux.h
> > ----
> >
> > Uapi header:
> > - linux/signal.h
> >
> > Conflict with vmlinux.h in definition of 'struct sigaction'.
> > Defined in:
> > - vmlinux.h: kernel/include/linux/signal_types.h
> > - uapi: kernel/arch/x86/include/asm/signal.h
> >
> > Uapi headers:
> > - linux/tipc_sockets_diag.h
> > - linux/sock_diag.h
> >
> > Conflict with vmlinux.h in definition of 'SOCK_DESTROY'.
> > Defined in:
> > - vmlinux.h: kernel/include/net/sock.h
> > - uapi: kernel/include/uapi/linux/sock_diag.h
> > Constants seem to be unrelated.
> >
> > And so on... I have details for many other headers but omit those for
> > brevity.
> >
> > In conclusion
> > ---
> >
> > Except from the general feasibility I have a few questions:
> > - What UAPI headers are the candidates for such use? If there are some
> > interesting headers currently not working with this patch-set some
> > hacks have to be added (e.g. like with `linux/if.h`).
> > - Is it ok to encode header guards as special `BTF_DECL_TAG` or should
> > I change the BTF format a bit to save some bytes.
> >
> > Thanks,
> > Eduard
> >
> >
> > [1] https://reviews.llvm.org/D111307
> > [clang] __attribute__ bpf_dominating_decl
> > [2] https://lore.kernel.org/dwarves/20221025220729.2293891-1-eddyz87@gmail.com/T/
> > [RFC dwarves] pahole: Save header guard names when
> > --header_guards_db is passed
> >
> > Eduard Zingerman (12):
> > libbpf: Deduplicate unambigous standalone forward declarations
> > selftests/bpf: Tests for standalone forward BTF declarations
> > deduplication
> > libbpf: Support for BTF_DECL_TAG dump in C format
> > selftests/bpf: Tests for BTF_DECL_TAG dump in C format
> > libbpf: Header guards for selected data structures in vmlinux.h
> > selftests/bpf: Tests for header guards printing in BTF dump
> > bpftool: Enable header guards generation
> > kbuild: Script to infer header guard values for uapi headers
> > kbuild: Header guards for types from include/uapi/*.h in kernel BTF
> > selftests/bpf: Script to verify uapi headers usage with vmlinux.h
> > selftests/bpf: Known good uapi headers for test_uapi_headers.py
> > selftests/bpf: script for infer_header_guards.pl testing
> >
> > scripts/infer_header_guards.pl | 191 +++++
> > scripts/link-vmlinux.sh | 13 +-
> > tools/bpf/bpftool/btf.c | 4 +-
> > tools/lib/bpf/btf.c | 178 ++++-
> > tools/lib/bpf/btf.h | 7 +-
> > tools/lib/bpf/btf_dump.c | 232 +++++-
> > .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++
> > tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++
> > .../selftests/bpf/prog_tests/btf_dump.c | 11 +-
> > .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 +
> > .../progs/btf_dump_test_case_header_guards.c | 94 +++
> > .../bpf/test_uapi_header_guards_infer.sh | 33 +
> > .../selftests/bpf/test_uapi_headers.py | 197 +++++
> > 13 files changed, 1816 insertions(+), 12 deletions(-)
> > create mode 100755 scripts/infer_header_guards.pl
> > create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt
> > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c
> > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c
> > create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh
> > create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py
> >
next prev parent reply other threads:[~2022-10-26 23:54 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-25 22:27 [RFC bpf-next 00/12] Use uapi kernel headers with vmlinux.h Eduard Zingerman
2022-10-25 22:27 ` [RFC bpf-next 01/12] libbpf: Deduplicate unambigous standalone forward declarations Eduard Zingerman
2022-10-27 22:07 ` Andrii Nakryiko
2022-10-31 1:00 ` Eduard Zingerman
2022-10-31 15:49 ` Eduard Zingerman
2022-11-01 17:08 ` Alan Maguire
2022-11-01 17:37 ` Eduard Zingerman
2022-10-25 22:27 ` [RFC bpf-next 02/12] selftests/bpf: Tests for standalone forward BTF declarations deduplication Eduard Zingerman
2022-10-25 22:27 ` [RFC bpf-next 03/12] libbpf: Support for BTF_DECL_TAG dump in C format Eduard Zingerman
2022-10-27 22:36 ` Andrii Nakryiko
2022-10-25 22:27 ` [RFC bpf-next 04/12] selftests/bpf: Tests " Eduard Zingerman
2022-10-25 22:27 ` [RFC bpf-next 05/12] libbpf: Header guards for selected data structures in vmlinux.h Eduard Zingerman
2022-10-27 22:44 ` Andrii Nakryiko
2022-10-25 22:27 ` [RFC bpf-next 06/12] selftests/bpf: Tests for header guards printing in BTF dump Eduard Zingerman
2022-10-25 22:27 ` [RFC bpf-next 07/12] bpftool: Enable header guards generation Eduard Zingerman
2022-10-25 22:27 ` [RFC bpf-next 08/12] kbuild: Script to infer header guard values for uapi headers Eduard Zingerman
2022-10-27 22:51 ` Andrii Nakryiko
2022-10-25 22:27 ` [RFC bpf-next 09/12] kbuild: Header guards for types from include/uapi/*.h in kernel BTF Eduard Zingerman
2022-10-27 18:43 ` Yonghong Song
2022-10-27 18:55 ` Yonghong Song
2022-10-27 22:44 ` Yonghong Song
2022-10-28 0:00 ` Eduard Zingerman
2022-10-28 0:14 ` Mykola Lysenko
2022-10-28 1:23 ` Yonghong Song
2022-10-28 1:21 ` Yonghong Song
2022-10-25 22:27 ` [RFC bpf-next 10/12] selftests/bpf: Script to verify uapi headers usage with vmlinux.h Eduard Zingerman
2022-10-25 22:28 ` [RFC bpf-next 11/12] selftests/bpf: Known good uapi headers for test_uapi_headers.py Eduard Zingerman
2022-10-25 22:28 ` [RFC bpf-next 12/12] selftests/bpf: script for infer_header_guards.pl testing Eduard Zingerman
2022-10-25 23:46 ` [RFC bpf-next 00/12] Use uapi kernel headers with vmlinux.h Alexei Starovoitov
2022-10-26 22:46 ` Eduard Zingerman
2022-10-26 11:10 ` Alan Maguire
2022-10-26 23:54 ` Eduard Zingerman [this message]
2022-10-27 23:14 ` Andrii Nakryiko
2022-10-28 1:33 ` Yonghong Song
2022-10-28 17:13 ` Andrii Nakryiko
2022-10-28 18:56 ` Yonghong Song
2022-10-28 21:35 ` Andrii Nakryiko
2022-11-01 16:01 ` Alan Maguire
2022-11-01 18:35 ` Alexei Starovoitov
2022-11-01 19:21 ` Eduard Zingerman
2022-11-01 19:44 ` Alexei Starovoitov
2022-11-11 21:55 ` Eduard Zingerman
2022-11-14 7:52 ` Yonghong Song
2022-11-14 21:13 ` Eduard Zingerman
2022-11-14 21:50 ` Alexei Starovoitov
2022-11-16 2:01 ` Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=eaf0f6a5dd6a8b2665b84770f83b3db577abd72e.camel@gmail.com \
--to=eddyz87@gmail.com \
--cc=alan.maguire@oracle.com \
--cc=andrii@kernel.org \
--cc=arnaldo.melo@gmail.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kernel-team@fb.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox