From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: bpf@vger.kernel.org
Subject: BPF GCC status - Nov 2023
Date: Tue, 28 Nov 2023 17:23:06 +0100
Message-ID: <87leahx2xh.fsf@oracle.com>
[During LPC 2023 we talked about improving communication between the GCC
BPF toolchain port and the kernel side. This is the first periodic
report that we plan to publish in the GCC wiki and send to interested
parties. Hopefully this will help.]
GCC wiki page for the port: https://gcc.gnu.org/wiki/BPFBackEnd
IRC channel: #gccbpf at irc.oftc.net.
Help on using the port: gcc@gcc.gnu.org
Patches and/or development discussions: gcc-patches@gcc.gnu.org
Assembler
=========
- The BPF assembler was sometimes generating spurious symbols. The
problem was that supporting the pseudo-C assembly syntax for BPF makes
it impossible to use the traditional technique of hashing on
mnemonics. Instead, we are forced to attempt parsing entries in our
opcodes table until some instruction template matches. In some cases
this was causing the parser to incorrectly parse part of an
instruction opcode as an expression, which led to the creation of a
new undefined symbol.
David Faust installed a fix for this upstream:
https://sourceware.org/pipermail/binutils/2023-November/130668.html
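The "try each template until one matches" strategy described above can be sketched in a few lines of C. This is only an illustration and not the GAS implementation: the template syntax (`%r` for a register, `%i` for a decimal immediate), the table contents, and the function names are all invented here.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical templates: %r matches r0..r10, %i matches a decimal integer. */
static const char *const templates[] = {
    "%r = %i",
    "%r += %i",
    "%r = %r",
    "exit",
};

/* Try to match INSN against one template.  Returns 1 on a full match. */
static int match_template(const char *tmpl, const char *insn)
{
    while (*tmpl != '\0') {
        char *end;

        if (tmpl[0] == '%' && tmpl[1] == 'r') {
            long n;

            if (*insn != 'r' || !isdigit((unsigned char)insn[1]))
                return 0;
            n = strtol(insn + 1, &end, 10);
            if (n < 0 || n > 10)
                return 0;
            insn = end;
            tmpl += 2;
        } else if (tmpl[0] == '%' && tmpl[1] == 'i') {
            (void)strtol(insn, &end, 10);
            if (end == insn)
                return 0;   /* no digits consumed: not an immediate */
            insn = end;
            tmpl += 2;
        } else {
            if (*tmpl != *insn)
                return 0;   /* literal character mismatch */
            tmpl++;
            insn++;
        }
    }
    return *insn == '\0';
}

/* Return the index of the first matching template, or -1 if none matches. */
static int find_template(const char *insn)
{
    size_t i;

    for (i = 0; i < sizeof templates / sizeof templates[0]; i++)
        if (match_template(templates[i], insn))
            return (int)i;
    return -1;
}
```

Because there is no leading mnemonic to hash on, an input such as `r1 = r2` is first tried against `%r = %i`, fails when the immediate is expected, and only then matches `%r = %r`. Any side effect of a failed attempt (such as creating a symbol) has to be undone or avoided, which is the class of bug the fix above addresses.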
- gas: change meaning of ; in the BPF assembler.
The clang assembler interprets semicolons as a statement/directive
separator. In the GNU BPF assembler that character was being
interpreted as the beginning of a line comment, as is usual in
assembly languages. We detected this discrepancy with snippets like:
asm volatile (" \
if r1 >= 0 goto l0_%=; \
r0 = 1; \
r0 += 2; \
l0_%=: exit; \
" ::: __clobber_all);
We installed a patch upstream that makes GAS behave like the clang
assembler when interpreting semicolons in assembly programs:
Jose E. Marchesi
https://sourceware.org/pipermail/binutils/2023-November/130867.html
The simulator tests have been updated accordingly:
Jose E. Marchesi
https://sourceware.org/pipermail/gdb-patches/2023-November/204581.html
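The practical difference between the two interpretations of `;` can be illustrated with a small C sketch that counts the statements on a line when semicolons act as separators. The function name is hypothetical and this is not how GAS tokenizes its input; it only shows the intended semantics.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the non-empty statements in LINE, treating ';' as a separator. */
static size_t count_statements(const char *line)
{
    size_t count = 0;
    int in_statement = 0;

    for (; *line != '\0'; line++) {
        if (*line == ';') {
            in_statement = 0;           /* the next non-blank starts a new one */
        } else if (!isspace((unsigned char)*line) && !in_statement) {
            in_statement = 1;
            count++;
        }
    }
    return count;
}
```

With separator semantics, `r0 = 1; r0 += 2;` contains two statements. Under the old comment semantics, everything after the first semicolon would have been discarded and only `r0 = 1` would have been assembled.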
- In the Pseudo-C syntax register names are not preceded by % characters
nor any other prefix. A consequence of that is that in contexts like
instruction operands, where both register names and expressions
involving symbols are expected, there is no way to disambiguate
between them. GAS was allowing symbols like `w3' or `r5' in syntactic
contexts where no registers were expected, such as in:
r0 = w3 ll ; GAS interpreted w3 as symbol, clang emits error
The clang assembler wasn't allowing that. During LPC we agreed that
the simplest approach is to not allow any symbol to have the same name
as a register, in any context. So we changed GAS so that it no longer
allows register names to be used as symbols in any expression, such as:
r0 = w3 + 1 ll ; This now fails for both GAS and llvm.
r0 = 1 + w3 ll ; NOTE this does not fail with llvm, but it should.
We installed a patch in GAS for this.
Jose E. Marchesi
https://sourceware.org/pipermail/binutils/2023-November/130684.html
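A check of the kind implied by this rule might look like the following C sketch. It assumes the reserved names are exactly r0..r10 and w0..w10; the precise set that GAS rejects, and the function name used here, are assumptions made for illustration.

```c
#include <stdlib.h>

/* Return 1 if NAME is r0..r10 or w0..w10, 0 otherwise. */
static int is_bpf_register_name(const char *name)
{
    char *end;
    long n;

    if (name[0] != 'r' && name[0] != 'w')
        return 0;
    if (name[1] < '0' || name[1] > '9')
        return 0;
    /* Reject leading zeros such as "r01". */
    if (name[1] == '0' && name[2] != '\0')
        return 0;
    n = strtol(name + 1, &end, 10);
    return *end == '\0' && n >= 0 && n <= 10;
}
```

An expression parser following the rule above would consult such a predicate before creating a symbol and report an error for names like `w3` or `r5`, regardless of where they appear in the expression.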
- Cupertino Miranda fixed a GAS bug in the parsing of registers in
pseudo-C syntax mode:
https://sourceware.org/pipermail/binutils/2023-November/130732.html
Compiler
========
- Remove bpf-helpers.h.
Now that we are finally able to use the kernel provided bpf_helpers.h
file and associated machinery, there is no longer need to distribute
our own version.
Jose E. Marchesi
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638226.html
- Restore BPF build, always_inline in libgcc
Jose E. Marchesi
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637948.html
- Fix expected regexp in gcc.target/bpf/ldxdw.c test
Jose E. Marchesi
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635892.html
- Fix pseudo-C asm emitted for *mulsidi3_zeroextend
Jose E. Marchesi
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635896.html
- Corrected condition in core_mark_as_access index.
Cupertino Miranda
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636389.html
- Delayed the removal of the parser enum plugin handler.
Cupertino Miranda
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636388.html
- Force inlining of __builtin_memcmp for data sizes up to 1024 bytes.
Cupertino Miranda
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636390.html
- Emit errors for libcalls and builtin-generated libcalls, like clang
does.
Cupertino Miranda
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638117.html
- GCC was emitting external function declarations corresponding to
code that was attempted but eventually discarded. This happened, for
example, when GCC tried a particular code sequence that was then
discarded in favor of a better-performing alternative. This was a
problem with BPF instruction set versions <= v3, which lack signed
division. This is now fixed upstream.
Jose E. Marchesi
BZ 109253
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638143.html
- Indu Bhagat is investigating a BTF generation problem that is
resulting in non-anonymous FUNC_PROTO entries, which are not allowed
in BTF and rejected by the BPF loader. This apparently happens when
functions get inlined.
Pending Patches for bpf-next
============================
These are the current patches we still have to submit to bpf@vger for
bpf-next. We are in the process of testing them:
- bpf: add more options for gcc-bpf to selftests/bpf/Makefile
This patch passes the following extra options to BPF_GCC in
GCC_BPF_BUILD_RULE:
-masm=pseudoc
-mco-re
-Wno-unknown-pragmas
-Wno-unused-variable
-Wno-error=attributes
-Wno-error=address-of-packed-member
-Wno-compare-distinct-pointer-types
-fno-strict-aliasing
Most of them either silence certain warnings or stop them from being
treated as errors. Code like:
#define __imm_insn(name, expr) [name]"i"(*(long *)&(expr))
where `expr' is something like a pointer to a bpf_insn, requires
disabling strict aliasing, which is activated by default with -O2 in
GCC.
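The aliasing concern can be illustrated in plain C. The sketch below reads the bytes of an 8-byte record as a 64-bit integer with memcpy, which is well-defined with strict aliasing enabled, whereas a `*(long *)&` cast on the same object is only safe with -fno-strict-aliasing. The struct and function names are hypothetical, and this is not a drop-in replacement for __imm_insn, which needs a compile-time constant for its "i" constraint.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte record standing in for an instruction encoding. */
struct insn8 {
    uint8_t code;
    uint8_t regs;
    int16_t off;
    int32_t imm;
};

_Static_assert(sizeof(struct insn8) == sizeof(uint64_t),
               "record must be exactly 8 bytes");

/* Well-defined under strict aliasing: copy the bytes into an integer
   instead of dereferencing a pointer of an unrelated type. */
static uint64_t insn_bits(const struct insn8 *insn)
{
    uint64_t bits;

    memcpy(&bits, insn, sizeof bits);
    return bits;
}
```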
- bpf: use r constraint instead of p constraint
This was discussed in bpf@vger and it was decided that we would stop
using the "p" constraint in the BPF kernel selftests. That constraint
is not really meant to be used externally to the compiler.
https://lore.kernel.org/bpf/87edkbnq14.fsf@oracle.com/
- bpf_core_read.h: GCC specific macro for preserve_enum_value
This patch adds a version of the bpf_core_enum_value macro to be used
by GCC. The implementations of CO-RE built-ins in clang and GCC
require different "magical expressions" to be passed to the built-ins.
These macros hide the complexity from the user.
- bpf: avoid VLAs in progs/test_xdp_dynptr.c
progs/test_xdp_dynptr.c declares several VLAs in the handle_ipv4 and
handle_ipv6 functions:
const size_t tcphdr_sz = sizeof(struct tcphdr);
const size_t udphdr_sz = sizeof(struct udphdr);
const size_t ethhdr_sz = sizeof(struct ethhdr);
const size_t iphdr_sz = sizeof(struct iphdr);
const size_t ipv6hdr_sz = sizeof(struct ipv6hdr);
[...]
static __always_inline int handle_ipv6(struct xdp_md *xdp, struct bpf_dynptr *xdp_ptr)
{
__u8 eth_buffer[ethhdr_sz + ipv6hdr_sz + ethhdr_sz];
__u8 ip6h_buffer_tcp[ipv6hdr_sz + tcphdr_sz];
__u8 ip6h_buffer_udp[ipv6hdr_sz + udphdr_sz];
[...]
}
Neither GCC nor clang allows dynamic stack allocation for BPF (GCC
used to support it, using one register as an auxiliary stack pointer,
but no longer does).
The above code builds with clang but not with GCC:
progs/test_xdp_dynptr.c:79:14: error: BPF does not support dynamic stack allocation
79 | __u8 eth_buffer[ethhdr_sz + iphdr_sz + ethhdr_sz];
| ^~~~~~~~~~
We are guessing that clang turns these arrays from VLAs into normal
statically sized arrays because ethhdr_sz and friends are constant and
set to sizeof, which is always known at compile time. This patch
changes the selftest to use preprocessor constants instead of
variables:
#define tcphdr_sz sizeof(struct tcphdr)
#define udphdr_sz sizeof(struct udphdr)
#define ethhdr_sz sizeof(struct ethhdr)
#define iphdr_sz sizeof(struct iphdr)
#define ipv6hdr_sz sizeof(struct ipv6hdr)
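The distinction can be reproduced outside the selftest. In C, a `const size_t` object is not an integer constant expression, so an array sized with it is formally a VLA, whereas a macro that expands to `sizeof` yields a normal fixed-size array. The following sketch uses made-up struct names and sizes to show the macro form:

```c
#include <stddef.h>

/* Hypothetical header layouts, used only to have something to measure. */
struct hdr_a { unsigned char bytes[14]; };
struct hdr_b { unsigned char bytes[20]; };

/* Macros expand to sizeof expressions, which are integer constant
   expressions, so arrays sized with them are ordinary fixed-size arrays. */
#define hdr_a_sz sizeof(struct hdr_a)
#define hdr_b_sz sizeof(struct hdr_b)

static size_t fixed_buffer_size(void)
{
    unsigned char buffer[hdr_a_sz + hdr_b_sz];

    /* This compiles only because buffer is not a VLA: sizeof applied to
       a VLA is not a constant expression. */
    _Static_assert(sizeof(buffer) == 34, "buffer has a compile-time size");
    return sizeof(buffer);
}
```

If hdr_a_sz and hdr_b_sz were instead declared as `const size_t` variables, the static assertion would be rejected, which is a quick way to confirm that the array had become a VLA.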
- bpf_helpers.h: define bpf_tail_call_static when building with GCC
- bpf: fix constraint in test_tcpbpf_kern.c
GCC emits a warning, which -Werror turns into an error:
progs/test_tcpbpf_kern.c:60:9: error: ‘op’ is used uninitialized [-Werror=uninitialized]
when the uninitialized automatic `op' is used with a "+r" constraint
in:
asm volatile (
"%[op] = *(u32 *)(%[skops] +96)"
: [op] "+r"(op)
: [skops] "r"(skops)
:);
The constraint should be "=r" instead, since the asm only writes `op'.
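The difference between the two constraints can be shown with a target-independent sketch using GNU extended asm. An empty template stands in for the BPF load, and the function name is hypothetical; the point is only that an output declared with "=r" is write-only, so leaving it uninitialized beforehand is fine, whereas "+r" tells the compiler that the previous value is read.

```c
/* Copy IN to the result through an empty asm statement. */
static unsigned int copy_through_asm(unsigned int in)
{
    unsigned int op;    /* deliberately left uninitialized */

    /* "=r": op is write-only, so its previous value is never read.
       "0": place the input in the same register as operand 0, so the
       empty template leaves op holding the value of in. */
    __asm__ volatile ("" : "=r"(op) : "0"(in));
    return op;
}
```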
Open Questions
==============
- BPF programs including libc headers.
BPF programs run on their own without an operating system or a C
library. Implementing C implies providing certain definitions and
headers, such as stdint.h and stdarg.h. For such targets, known as
"bare metal targets", the compiler has to provide these definitions
and headers in order to implement the language.
GCC provides the following C headers for BPF targets:
float.h
gcov.h
iso646.h
limits.h
stdalign.h
stdarg.h
stdatomic.h
stdbool.h
stdckdint.h
stddef.h
stdfix.h
stdint.h
stdnoreturn.h
syslimits.h
tgmath.h
unwind.h
varargs.h
However, we have found that there is at least one BPF kernel selftest
that includes glibc headers that, indirectly, include glibc's own
definitions of stdint.h and friends. This leads to compile-time
errors due to conflicting types. We think that including headers from
a glibc built for some host target is very questionable. For example,
in BPF a C `char' is defined to be signed. But if a BPF program
includes glibc headers on an Android system, that code will assume an
unsigned char instead.
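The signedness of plain `char' is a property of the target, and code can observe it through limits.h. The sketch below, with hypothetical function names, shows how the same byte widens to different values depending on that choice, which is why headers written for one target can silently misbehave when compiled for another.

```c
#include <limits.h>

/* Return 1 if plain char is signed for the current target, 0 otherwise. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widen a byte to int the way plain char does on this target. */
static int widen_char(char c)
{
    return c;
}
```

On a target where char is signed, widening the byte 0xff gives -1; where char is unsigned, it gives 255.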
Other Updates
=============
- Brian Witte has adapted the Waldo 80211 debug/test/trace wireless
analyzer tool to be built with GCC BPF. This includes CI that uses
the latest GCC git version, which is quite useful for us.
https://git.sr.ht/~brianwitte/waldo-80211
- Brian has also published a very simple BPF program example, tested
and documented, with the goal of providing an accessible and
instructive example for those interested in BPF development with the
GNU toolchain.
https://git.sr.ht/~brianwitte/gcc-bpf-example