public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Yonghong Song <yonghong.song@linux.dev>
Cc: bpf@vger.kernel.org, Nick Desaulniers <ndesaulniers@google.com>
Subject: Re: Usage of "p" constraint in BPF inline asm
Date: Thu, 10 Aug 2023 19:39:38 +0200	[thread overview]
Message-ID: <87a5uyiyp1.fsf@oracle.com> (raw)
In-Reply-To: <a4c550e4-1d65-aace-d9ba-820b89390f54@linux.dev> (Yonghong Song's message of "Thu, 10 Aug 2023 10:26:46 -0700")


> On 8/10/23 3:35 AM, Jose E. Marchesi wrote:
>> Hello.
>> We found that some of the BPF selftests use the "p" constraint in
>> inline
>> assembly snippets, for input operands for MOV (rN = rM) instructions.
>> This is mainly done via the __imm_ptr macro defined in
>> tools/testing/selftests/bpf/progs/bpf_misc.h:
>>    #define __imm_ptr(name) [name]"p"(&name)
>> Example:
>>    int consume_first_item_only(void *ctx)
>>    {
>>          struct bpf_iter_num iter;
>>          asm volatile (
>>                  /* create iterator */
>>                  "r1 = %[iter];"
>>                  [...]
>>                  :
>>                  : __imm_ptr(iter)
>>                  : CLOBBERS);
>>          [...]
>>    }
>> Little equivalent reproducer:
>>    int bar ()
>>    {
>>      int jorl;
>>      asm volatile ("r1 = %a[jorl]" : : [jorl]"p"(&jorl));
>>      return jorl;
>>    }
>> The "p" constraint is a tricky one.  It is documented in the GCC
>> manual
>> section "Simple Constraints":
>>    An operand that is a valid memory address is allowed.  This is
>> for
>>    ``load address'' and ``push address'' instructions.
>>    p in the constraint must be accompanied by address_operand as the
>>    predicate in the match_operand.  This predicate interprets the mode
>>    specified in the match_operand as the mode of the memory reference for
>>    which the address would be valid.
>> There are two problems:
>> 1. It is questionable whether that constraint was ever intended to
>> be
>>     used in inline assembly templates, because its behavior really
>>     depends on compiler internals.  A "memory address" is not the same
>>     than a "memory operand" or a "memory reference" (constraint "m"), and
>>     in fact its usage in the template above results in an error in both
>>     x86_64-linux-gnu and bpf-unkonwn-none:
>>       foo.c: In function ‘bar’:
>>       foo.c:6:3: error: invalid 'asm': invalid expression as operand
>>          6 |   asm volatile ("r1 = %[jorl]" : : [jorl]"p"(&jorl));
>>            |   ^~~
>>     I would assume the same happens with aarch64, riscv, and
>> most/all
>>     other targets in GCC, that do not accept operands of the form A + B
>>     that are not wrapped either in a const or in a memory reference.
>>     To avoid that error, the usage of the "p" constraint in internal
>> GCC
>>     instruction templates is supposed to be complemented by the 'a'
>>     modifier, like in:
>>       asm volatile ("r1 = %a[jorl]" : : [jorl]"p"(&jorl));
>>     Internally documented (in GCC's final.cc) as:
>>       %aN means expect operand N to be a memory address
>>          (not a memory reference!) and print a reference
>>          to that address.
>>     That works because when the modifier 'a' is found, GCC prints an
>>     "operand address", which is not the same than an "operand".
>>     But...
>> 2. Even if we used the internal 'a' modifier (we shouldn't) the 'rN
>> =
>>     rM' instruction really requires a register argument.  In cases
>>     involving automatics, like in the examples above, we easily end with:
>>       bar:
>>          #APP
>>              r1 = r10-4
>>          #NO_APP
>>     In other cases we could conceibly also end with a 64-bit label
>> that
>>     may overflow the 32-bit immediate operand of `rN = imm32'
>>     instructions:
>>          r1 = foo
>>     All of which is clearly wrong.
>> clang happens to do "the right thing" in the current usage of
>> __imm_ptr
>> in the BPF tests, because even with -O2 it seems to "reload" the
>> fp-relative address of the automatic to a register like in:
>>    bar:
>> 	r1 = r10
>> 	r1 += -4
>> 	#APP
>> 	r1 = r1
>> 	#NO_APP
>
> Unfortunately, the modifier 'a' won't work for clang.
>
> $ cat t.c  int bar ()  {     int jorl;     asm volatile ("r1 =
> %a[jorl]" : : [jorl]"p"(&jorl));     return jorl;  }  $ gcc -O2 -g -S
> t.c  $ clang --target=bpf -O2 -g -S t.c  clang:
> ../lib/Target/BPF/BPFAsmPrinter.cpp:126: virtual bool
> {anonymous}::BPFAsmPrinter::PrintAsmMemoryOperand(const
> llvm::MachineInstr*, unsigned int, const char*, llvm::raw_ostream&):
> Assertion `Offs
> etMO.isImm() && "Unexpected offset for inline asm memory operand."' failed.
> ...
>
> I guess BPF backend can try to add support for this 'a' modifier
> if necessary.

I wouldn't advise that: it is an internal GCC detail that just happens
to work in inline asm.  Also, even if you did that constraint may result
in operands that are not single registers.  It would be better to use
"r" constraint instead.

>
>> Which is what GCC would generate with -O0.  Whether this is by chance or
>> by design (Nick, do you know?) I don't think the compiler should be
>> expected to do that reload driven by the "p" constraint.
>> I would suggest to change that macro (and similar out of macro
>> usages of
>> the "p" constraint in selftests/bpf/progs/iters.c) to use the "r"
>> constraint instead.  If a register is what is required, we should let
>> the compiler know.
>
> Could you specify what is the syntax ("r" constraint) which will work
> for both clang and gcc?

Instead of:

   #define __imm_ptr(name) [name]"p"(&name)

Use this:

   #define __imm_ptr(name) [name]"r"(&name)

That assures that the operand (the pointer value) will be available in
the form of a single register.

>
>> Thoughts?
>> PS: I am aware that the x86 port of the kernel uses the "p"
>> constraint
>>      in the percpu macros (arch/x86/include/asm/percpu.h) but that usage
>>      is in a different context (I would assume it is used in x86
>>      instructions that get constant addresses or global addresses loaded
>>      in registers and not automatics) where it seems to work well.
>> 

  reply	other threads:[~2023-08-10 17:39 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-10 10:35 Usage of "p" constraint in BPF inline asm Jose E. Marchesi
2023-08-10 17:26 ` Yonghong Song
2023-08-10 17:39   ` Jose E. Marchesi [this message]
2023-08-10 17:45     ` Yonghong Song
2023-08-10 19:01       ` Eduard Zingerman
2023-08-10 19:10         ` Jose E. Marchesi
2023-08-10 19:38           ` Eduard Zingerman
2023-08-11 12:01             ` Jose E. Marchesi
2023-08-11 12:18               ` Eduard Zingerman
2023-08-11 12:27                 ` Eduard Zingerman
2023-08-11 14:10                 ` Jose E. Marchesi
2023-08-11 16:12                   ` Eduard Zingerman
2023-08-11 17:22                     ` Jose E. Marchesi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a5uyiyp1.fsf@oracle.com \
    --to=jose.marchesi@oracle.com \
    --cc=bpf@vger.kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox