BPF List
 help / color / mirror / Atom feed
From: Yonghong Song <yonghong.song@linux.dev>
To: "Jose E. Marchesi" <jemarch@gnu.org>
Cc: "Jose E. Marchesi" <jose.marchesi@oracle.com>, bpf@vger.kernel.org
Subject: Re: BPF GCC status - Nov 2023
Date: Thu, 30 Nov 2023 09:39:08 -0800	[thread overview]
Message-ID: <840e33ec-ea4c-4b55-bda1-0be8d1e0324f@linux.dev> (raw)
In-Reply-To: <87v89j8emi.fsf@gnu.org>


On 11/30/23 10:06 AM, Jose E. Marchesi wrote:
>> On 11/30/23 7:13 AM, Jose E. Marchesi wrote:
>>>> On 11/29/23 2:08 AM, Jose E. Marchesi wrote:
>>>>>> On 11/28/23 11:23 AM, Jose E. Marchesi wrote:
>>>>>>> [During LPC 2023 we talked about improving communication between the GCC
>>>>>>>      BPF toolchain port and the kernel side.  This is the first periodical
>>>>>>>      report that we plan to publish in the GCC wiki and send to interested
>>>>>>>      parties.  Hopefully this will help.]
>>>>>>>
>>>>>>> GCC wiki page for the port: https://gcc.gnu.org/wiki/BPFBackEnd
>>>>>>> IRC channel: #gccbpf at irc.oftc.net.
>>>>>>> Help on using the port: gcc@gcc.gnu.org
>>>>>>> Patches and/or development discussions: gcc-patches@gnu.org
>>>>>> Thanks a lot for detailed report. Really helpful to nail down
>>>>>> issues facing one or both compilers. See comments below for
>>>>>> some mentioned issues.
>>>>>>
>>>>>>> Assembler
>>>>>>> =========
>>>>>> [...]
>>>>>>
>>>>>>> - In the Pseudo-C syntax register names are not preceded by % characters
>>>>>>>       nor any other prefix.  A consequence of that is that in contexts like
>>>>>>>       instruction operands, where both register names and expressions
>>>>>>>       involving symbols are expected, there is no way to disambiguate
>>>>>>>       between them.  GAS was allowing symbols like `w3' or `r5' in syntactic
>>>>>>>       contexts where no registers were expected, such as in:
>>>>>>>
>>>>>>>         r0 = w3 ll  ; GAS interpreted w3 as symbol, clang emits error
>>>>>>>
>>>>>>>       The clang assembler wasn't allowing that.  During LPC we agreed that
>>>>>>>       the simplest approach is to not allow any symbol to have the same name
>>>>>>>       than a register, in any context.  So we changed GAS so it now doesn't
>>>>>>>       allow to use register names as symbols in any expression, such as:
>>>>>>>
>>>>>>>         r0 = w3 + 1 ll  ; This now fails for both GAS and llvm.
>>>>>>>         r0 = 1 + w3 ll  ; NOTE this does not fail with llvm, but it should.
>>>>>> Could you provide a reproducible case above for llvm? llvm does not
>>>>>> support syntax like 'r0 = 1 + w3 ll'. For add, it only supports
>>>>>> 'r1 += r2' or 'r1 += 100' syntax.
>>>>> It is a 128-bit load with an expression.  In compiler explorer, clang:
>>>>>
>>>>>      int
>>>>>      foo ()
>>>>>      {
>>>>>        asm volatile ("r1 = 10 + w3 ll");
>>>>>        return 0;
>>>>>      }
>>>>>
>>>>> I get:
>>>>>
>>>>>      foo:                                    # @foo
>>>>>              r1 = 10+w3 ll
>>>>>              r0 = 0
>>>>>              exit
>>>>>
>>>>> i.e. `10 + w3' is interpreted as an expression with two operands: the
>>>>> literal number 10 and a symbol (not a register) `w3'.
>>>>>
>>>>> If the expression is `w3+10' instead, your parser recognizes the w3 as a
>>>>> register name and errors out, as expected.
>>>>>
>>>>> I suppose llvm allows to hook on the expression parser to handle
>>>>> individual operands.  That's how we handled this in GAS.
>>>> Thanks for the code. I can reproduce the result with compiler explorer.
>>>> The following is the link https://godbolt.org/z/GEGexf1Pj
>>>> where I added -grecord-gcc-switches to dump compilation flags
>>>> into .s file.
>>>>
>>>> The following is the compiler explorer compilation command line:
>>>> /opt/compiler-explorer/clang-trunk-20231129/bin/clang-18 -g -o /app/output.s \
>>>>     -S --target=bpf -fcolor-diagnostics -gen-reproducer=off -O2 \
>>>>     -g -grecord-command-line /app/example.c
>>>>
>>>> I then compile the above C code with
>>>>     clang -g -S --target=bpf -fcolor-diagnostics -gen-reproducer=off -O2 -g -grecord-command-line t.c
>>>> with identical flags.
>>>>
>>>> I tried locally with llvm16/17/18. They all failed compilation since
>>>> 'r1 = 10+w3 ll' cannot be recognized by the llvm.
>>>> We will investigate why llvm18 in compiler explorer compiles
>>>> differently from my local build.
>>> I updated git llvm master today and I managed to reproduce locally with:
>>>
>>> jemarch@termi:~/gnu/src/llvm-project/llvm/build$ clang --version
>>> clang version 18.0.0 (https://github.com/llvm/llvm-project.git 586986a063ee4b9a7490aac102e103bab121c764)
>>> Target: unknown
>>> Thread model: posix
>>> InstalledDir: /usr/local/bin
>>> $ cat foo.c
>>>       int
>>>       foo ()
>>>       {
>>>         asm volatile ("r1 = 10 + w3 ll");
>>>         return 0;
>>>       }
>>> $ clang -target bpf -c foo.c
>>> $ llvm-objdump -dr foo.o
>>>
>>> foo.o:	file format elf64-bpf
>>>
>>> Disassembly of section .text:
>>>
>>> 0000000000000000 <foo>:
>>>          0:	18 01 00 00 0a 00 00 00 00 00 00 00 00 00 00 00	r1 = 0xa ll
>>> 		0000000000000000:  R_BPF_64_64	w3
>>>          2:	b7 00 00 00 00 00 00 00	r0 = 0x0
>>>          3:	95 00 00 00 00 00 00 00	exit
>> Could you share the cmake command line options when you build you clang?
>> My cmake command line looks like
>> cmake .. -DCMAKE_BUILD_TYPE=Release -G Ninja \
>>      -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" \
>>      -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
>>      -DLLVM_ENABLE_ASSERTIONS=ON \
>>      -DLLVM_ENABLE_ZLIB=ON \
>>      -DCMAKE_INSTALL_PREFIX=$PWD/install
>>
>> and cannot reproduce the issue.
> I don't have the original cmake command, I executed it long ago
> (rebuilding clang/llvm in my laptop takes three days or more so I do it
> incrementally.)
>
> I see this in my CMakeCache.txt:
>
>    LLVM_ENABLE_PROJECTS:STRING=clang
>    LLVM_TARGETS_TO_BUILD:STRING=BPF
>    LLVM_ENABLE_ASSERTIONS:BOOL=OFF
>    LLVM_ENABLE_ZLIB:STRING=ON
>    CMAKE_INSTALL_PREFIX:PATH=/usr/local

Thanks for your cmake command line options. Looks like the reason is due to LLVM_ENABLE_ASSERTIONS=OFF while in my case
LLVM_ENABLE_ASSERTIONS=ON.

The related function in llvm is:

static void printExpr(const MCExpr *Expr, raw_ostream &O) {
#ifndef NDEBUG
   const MCSymbolRefExpr *SRE;

   if (const MCBinaryExpr *BE = dyn_cast<MCBinaryExpr>(Expr))
     SRE = dyn_cast<MCSymbolRefExpr>(BE->getLHS());
   else
     SRE = dyn_cast<MCSymbolRefExpr>(Expr);
   assert(SRE && "Unexpected MCExpr type.");

   MCSymbolRefExpr::VariantKind Kind = SRE->getKind();

   assert(Kind == MCSymbolRefExpr::VK_None);
#endif
   O << *Expr;
}

If LLVM_ENABLE_ASSERTIONS=ON, NDEBUG will not be defined
and 'assert' will actually do assertion.
If LLVM_ENABLE_ASSERTIONS=OFF, NDEBUG will be defined
and 'assert' will be a noop.

That is why ASSERTIONS OFF flag is okay while ASSERTIONS ON
will cause the following error:
    $ clang --target=bpf -g -S -O2 t.c
    clang: ../lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp:46: void printExpr(const llvm::MCExpr*, llvm::raw_ostream&):
        Assertion `SRE && "Unexpected MCExpr type."' failed.
    .... stack trace etc. ....

I also tried with my local redhat built clang15 and it didn't produce error either.
   $ /bin/clang --target=bpf -g -S -O2 t.c
   $ rpm -qf /bin/clang
   clang-16.0.6-1.el9.x86_64
Looks like their cmake options does not have LLVM_ENABLE_ASSERTIONS at all
which I assume is OFF. See
   https://gitlab.com/redhat/centos-stream/rpms/clang/-/blob/4fcf8241b99430ba239c5461b962fea1f3107a22/clang.spec

clang really does not support this syntax:
    r1 = 10 + w3 ll

The following clang patch will emit error regardless of LLVM_ENABLE_ASSERTIONS value.

======

diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp b/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp
index 15ab55f95e69..c266538bec73 100644
--- a/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp
+++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp
@@ -36,15 +36,16 @@ void BPFInstPrinter::printInst(const MCInst *MI, uint64_t Address,
  }
  
  static void printExpr(const MCExpr *Expr, raw_ostream &O) {
-#ifndef NDEBUG
    const MCSymbolRefExpr *SRE;
  
    if (const MCBinaryExpr *BE = dyn_cast<MCBinaryExpr>(Expr))
      SRE = dyn_cast<MCSymbolRefExpr>(BE->getLHS());
    else
      SRE = dyn_cast<MCSymbolRefExpr>(Expr);
-  assert(SRE && "Unexpected MCExpr type.");
+  if (!SRE)
+    report_fatal_error("Unexpected MCExpr type.");
  
+#ifndef NDEBUG
    MCSymbolRefExpr::VariantKind Kind = SRE->getKind();
  
    assert(Kind == MCSymbolRefExpr::VK_None);

=======


  reply	other threads:[~2023-11-30 17:39 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-28 16:23 BPF GCC status - Nov 2023 Jose E. Marchesi
2023-11-29  5:50 ` Yonghong Song
2023-11-29  7:08   ` Jose E. Marchesi
2023-11-29 16:44     ` Yonghong Song
2023-11-29 17:01       ` Alexei Starovoitov
2023-11-29 17:44         ` Yonghong Song
2023-11-30 12:13       ` Jose E. Marchesi
2023-11-30 14:58         ` Yonghong Song
2023-11-30 15:06           ` Jose E. Marchesi
2023-11-30 17:39             ` Yonghong Song [this message]
2023-11-30 18:27 ` Andrii Nakryiko
2023-11-30 19:49   ` Jose E. Marchesi
2023-12-01 21:38     ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=840e33ec-ea4c-4b55-bda1-0be8d1e0324f@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=bpf@vger.kernel.org \
    --cc=jemarch@gnu.org \
    --cc=jose.marchesi@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox