From: "Jose E. Marchesi" <jemarch@gnu.org>
To: Yonghong Song <yonghong.song@linux.dev>
Cc: "Jose E. Marchesi" <jose.marchesi@oracle.com>, bpf@vger.kernel.org
Subject: Re: BPF GCC status - Nov 2023
Date: Thu, 30 Nov 2023 16:06:29 +0100
Message-ID: <87v89j8emi.fsf@gnu.org>
In-Reply-To: <b1b003f0-dfa7-434f-a03a-1c9e2a21c3bf@linux.dev> (Yonghong Song's message of "Thu, 30 Nov 2023 06:58:56 -0800")
> On 11/30/23 7:13 AM, Jose E. Marchesi wrote:
>>> On 11/29/23 2:08 AM, Jose E. Marchesi wrote:
>>>>> On 11/28/23 11:23 AM, Jose E. Marchesi wrote:
>>>>>> [During LPC 2023 we talked about improving communication between the GCC
>>>>>> BPF toolchain port and the kernel side. This is the first periodic
>>>>>> report that we plan to publish in the GCC wiki and send to interested
>>>>>> parties. Hopefully this will help.]
>>>>>>
>>>>>> GCC wiki page for the port: https://gcc.gnu.org/wiki/BPFBackEnd
>>>>>> IRC channel: #gccbpf at irc.oftc.net.
>>>>>> Help on using the port: gcc@gcc.gnu.org
>>>>>> Patches and/or development discussions: gcc-patches@gnu.org
>>>>> Thanks a lot for the detailed report. Really helpful to nail down
>>>>> issues facing one or both compilers. See comments below for
>>>>> some mentioned issues.
>>>>>
>>>>>> Assembler
>>>>>> =========
>>>>> [...]
>>>>>
>>>>>> - In the Pseudo-C syntax register names are not preceded by % characters
>>>>>> nor any other prefix. A consequence of that is that in contexts like
>>>>>> instruction operands, where both register names and expressions
>>>>>> involving symbols are expected, there is no way to disambiguate
>>>>>> between them. GAS was allowing symbols like `w3' or `r5' in syntactic
>>>>>> contexts where no registers were expected, such as in:
>>>>>>
>>>>>> r0 = w3 ll ; GAS interpreted w3 as symbol, clang emits error
>>>>>>
>>>>>> The clang assembler wasn't allowing that. During LPC we agreed that
>>>>>> the simplest approach is to not allow any symbol to have the same name
>>>>>> as a register, in any context. So we changed GAS so that it no longer
>>>>>> allows register names to be used as symbols in any expression, such as:
>>>>>>
>>>>>> r0 = w3 + 1 ll ; This now fails for both GAS and llvm.
>>>>>> r0 = 1 + w3 ll ; NOTE this does not fail with llvm, but it should.
>>>>> Could you provide a reproducible case above for llvm? llvm does not
>>>>> support syntax like 'r0 = 1 + w3 ll'. For add, it only supports
>>>>> 'r1 += r2' or 'r1 += 100' syntax.
>>>> It is a 64-bit immediate load (encoded as a 128-bit instruction) with
>>>> an expression. In compiler explorer, clang:
>>>>
>>>> int
>>>> foo ()
>>>> {
>>>>   asm volatile ("r1 = 10 + w3 ll");
>>>>   return 0;
>>>> }
>>>>
>>>> I get:
>>>>
>>>> foo: # @foo
>>>>     r1 = 10+w3 ll
>>>>     r0 = 0
>>>>     exit
>>>>
>>>> i.e. `10 + w3' is interpreted as an expression with two operands: the
>>>> literal number 10 and a symbol (not a register) `w3'.
>>>>
>>>> If the expression is `w3+10' instead, your parser recognizes the w3 as a
>>>> register name and errors out, as expected.
>>>>
>>>> I suppose llvm lets you hook into the expression parser to handle
>>>> individual operands. That's how we handled this in GAS.
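
The per-operand check described above can be sketched roughly as follows. This is only an illustration, not the GAS or llvm implementation: the function names are hypothetical, the register set is assumed to be r0..r10 and w0..w10, and the scanner is a crude stand-in for a real expression parser.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if NAME is r0..r10 or w0..w10 (assumed set). */
static bool is_bpf_register_name(const char *name)
{
    if (name[0] != 'r' && name[0] != 'w')
        return false;

    const char *digits = name + 1;
    size_t len = strlen(digits);

    if (len == 1)
        return digits[0] >= '0' && digits[0] <= '9';
    if (len == 2)
        return digits[0] == '1' && digits[1] == '0';
    return false;
}

/* Hypothetical check: true if any identifier in EXPR is a register name,
   regardless of where it appears in the expression. */
static bool expr_uses_register_name(const char *expr)
{
    const char *p = expr;

    while (*p) {
        if (isalpha((unsigned char)*p) || *p == '_') {
            char ident[32];
            size_t n = 0;

            /* Collect one identifier; overlong names are truncated, which
               is harmless because register names are at most 3 chars. */
            while (isalnum((unsigned char)*p) || *p == '_') {
                if (n < sizeof ident - 1)
                    ident[n++] = *p;
                p++;
            }
            ident[n] = '\0';
            if (is_bpf_register_name(ident))
                return true;
        } else if (isdigit((unsigned char)*p)) {
            /* Skip a numeric literal (including hex such as 0x1f) so its
               letters are not mistaken for an identifier. */
            while (isalnum((unsigned char)*p))
                p++;
        } else {
            p++;
        }
    }
    return false;
}
```

With a check like this applied to every operand, `10 + w3` and `w3 + 10` are both rejected, so the outcome no longer depends on which operand the parser sees first.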
>>> Thanks for the code. I can reproduce the result with compiler explorer.
>>> The following is the link https://godbolt.org/z/GEGexf1Pj
>>> where I added -grecord-gcc-switches to dump compilation flags
>>> into .s file.
>>>
>>> The following is the compiler explorer compilation command line:
>>> /opt/compiler-explorer/clang-trunk-20231129/bin/clang-18 -g -o /app/output.s \
>>> -S --target=bpf -fcolor-diagnostics -gen-reproducer=off -O2 \
>>> -g -grecord-command-line /app/example.c
>>>
>>> I then compiled the above C code locally with the same flags:
>>> clang -g -S --target=bpf -fcolor-diagnostics -gen-reproducer=off -O2 -g -grecord-command-line t.c
>>>
>>> I tried locally with llvm16/17/18. They all failed to compile it, since
>>> 'r1 = 10+w3 ll' is not recognized by llvm.
>>> We will investigate why llvm18 in compiler explorer compiles
>>> differently from my local build.
>> I updated git llvm master today and I managed to reproduce locally with:
>>
>> jemarch@termi:~/gnu/src/llvm-project/llvm/build$ clang --version
>> clang version 18.0.0 (https://github.com/llvm/llvm-project.git 586986a063ee4b9a7490aac102e103bab121c764)
>> Target: unknown
>> Thread model: posix
>> InstalledDir: /usr/local/bin
>> $ cat foo.c
>> int
>> foo ()
>> {
>>   asm volatile ("r1 = 10 + w3 ll");
>>   return 0;
>> }
>> $ clang -target bpf -c foo.c
>> $ llvm-objdump -dr foo.o
>>
>> foo.o: file format elf64-bpf
>>
>> Disassembly of section .text:
>>
>> 0000000000000000 <foo>:
>> 0: 18 01 00 00 0a 00 00 00 00 00 00 00 00 00 00 00 r1 = 0xa ll
>> 0000000000000000: R_BPF_64_64 w3
>> 2: b7 00 00 00 00 00 00 00 r0 = 0x0
>> 3: 95 00 00 00 00 00 00 00 exit
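
As a cross-check of the disassembly above, the 16 bytes of the first instruction can be decoded by hand. A minimal sketch in C, assuming the little-endian BPF encoding (an opcode byte, a register byte with the destination in the low nibble, a 16-bit offset, and a 32-bit immediate in each 8-byte slot). `struct lddw`, `read_le32` and `decode_lddw` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields of a 16-byte "ll" load. */
struct lddw {
    unsigned dst_reg;   /* destination register number */
    unsigned src_reg;   /* source register field (0 for a plain immediate) */
    uint64_t imm64;     /* 64-bit immediate assembled from both slots */
};

/* Read a 32-bit little-endian value from four bytes. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode a 64-bit immediate load; returns -1 if the opcode is not 0x18. */
static int decode_lddw(const uint8_t bytes[16], struct lddw *out)
{
    if (bytes[0] != 0x18)
        return -1;

    out->dst_reg = bytes[1] & 0x0f;
    out->src_reg = bytes[1] >> 4;
    /* Low 32 bits live in the first slot, high 32 bits in the second. */
    out->imm64 = ((uint64_t)read_le32(bytes + 12) << 32) |
                 read_le32(bytes + 4);
    return 0;
}
```

Given the bytes shown above, this yields destination register 1 and an immediate of 0xa. The assembler stored the literal 10 and left `w3` to the R_BPF_64_64 relocation, which means it treated `w3` as a symbol rather than a register.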
>
> Could you share the cmake command line options you used to build your clang?
> My cmake command line looks like
> cmake .. -DCMAKE_BUILD_TYPE=Release -G Ninja \
> -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" \
> -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
> -DLLVM_ENABLE_ASSERTIONS=ON \
> -DLLVM_ENABLE_ZLIB=ON \
> -DCMAKE_INSTALL_PREFIX=$PWD/install
>
> and cannot reproduce the issue.
I don't have the original cmake command; I ran it long ago (rebuilding
clang/llvm on my laptop takes three days or more, so I do it
incrementally).

I see this in my CMakeCache.txt:
LLVM_ENABLE_PROJECTS:STRING=clang
LLVM_TARGETS_TO_BUILD:STRING=BPF
LLVM_ENABLE_ASSERTIONS:BOOL=OFF
LLVM_ENABLE_ZLIB:STRING=ON
CMAKE_INSTALL_PREFIX:PATH=/usr/local