GCC and binutils support for BPF V4 instructions

public inbox for bpf@vger.kernel.org

* GCC and binutils support for BPF V4 instructions
@ 2023-07-28 16:41 Jose E. Marchesi
  2023-07-28 16:47 ` Alexei Starovoitov
  2023-07-28 16:59 ` Yonghong Song
  0 siblings, 2 replies; 15+ messages in thread
From: Jose E. Marchesi @ 2023-07-28 16:41 UTC (permalink / raw)
  To: bpf


Hello.

Just a heads up regarding the new BPF V4 instructions and their support
in the GNU Toolchain.

V4 sdiv/smod instructions

  Binutils has been updated to use the V4 encoding of these
  instructions, which used to be part of the xbpf testing dialect used
  in GCC.  GCC generates these instructions for signed division when
  -mcpu=v4 or higher.

V4 sign-extending register move instructions
V4 signed load instructions
V4 byte swap instructions

  Supported in assembler, disassembler and linker.  GCC generates these
  instructions when -mcpu=v4 or higher.
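
  For illustration only, the C functions below are the kind of source
  for which a BPF-targeted GCC invoked with -mcpu=v4 could be expected
  to select the instructions listed above.  The function names are made
  up for this sketch, and the exact instruction selection depends on the
  GCC version and optimization level, so treat the comments as
  expectations rather than guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Signed 64-bit division: a candidate for the V4 sdiv instruction. */
static int64_t sdiv_example(int64_t a, int64_t b)
{
    assert(b != 0);  /* keeps the sketch well-defined when run on a host */
    return a / b;
}

/* Signed 64-bit modulus: a candidate for the V4 smod instruction. */
static int64_t smod_example(int64_t a, int64_t b)
{
    assert(b != 0);
    return a % b;
}

/* Widening a signed 32-bit value: a candidate for a sign-extending move. */
static int64_t sext_example(int32_t x)
{
    return x;
}

/* Loading a signed byte and widening it: a candidate for a signed load. */
static int64_t sload_example(const int8_t *p)
{
    return *p;
}

/* Unconditional byte swap via a GCC builtin: a candidate for bswap. */
static uint32_t bswap_example(uint32_t x)
{
    return __builtin_bswap32(x);
}
```

  The same file also builds with a host compiler, which is handy for
  checking the C semantics independently of the BPF backend.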

V4 32-bit unconditional jump instruction

  Supported in assembler and disassembler.  GCC doesn't generate that
  instruction.

  However, the assembler has been expanded in order to perform the
  following relaxations when the disp16 field of a jump instruction is
  known at assembly time, and is overflown, unless -mno-relax is
  specified:

    JA disp16  -> JAL disp32
    Jxx disp16 -> Jxx +1; JA +1; JAL disp32

  Where Jxx is one of the conditional jump instructions such as jeq,
  jlt, etc.
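
  A minimal sketch of the two relaxations, not the actual gas code.  It
  assumes displacements are counted in instructions relative to the
  instruction that follows the jump, which means the JAL in the
  three-instruction sequence sits two slots later than the original jump
  and so carries disp - 2.  The struct, the function name and that
  adjustment are assumptions of this sketch; the opcode values are the
  BPF_JMP|BPF_JA and BPF_JMP32|BPF_JA encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified view of one 64-bit BPF instruction. */
struct insn {
    uint8_t opcode;
    uint8_t regs;  /* dst register in the low nibble, src in the high one */
    int16_t off;   /* disp16 */
    int32_t imm;   /* imm32; holds disp32 for the 32-bit jump */
};

#define OP_JA  0x05  /* BPF_JMP   | BPF_JA, displacement in off */
#define OP_JAL 0x06  /* BPF_JMP32 | BPF_JA, displacement in imm */

/*
 * Encode `jump` (JA or a conditional Jxx) with displacement `disp`.
 * Returns the number of instructions written to `out` (1 or 3), or 0
 * when the jump cannot be encoded (relaxation disabled or out of range).
 */
static int emit_jump(struct insn jump, int64_t disp, bool relax,
                     struct insn out[3])
{
    assert(out != NULL);

    if (disp >= INT16_MIN && disp <= INT16_MAX) {
        jump.off = (int16_t)disp;          /* fits: no relaxation needed */
        out[0] = jump;
        return 1;
    }
    if (!relax || disp > INT32_MAX || disp < (int64_t)INT32_MIN + 2)
        return 0;

    if (jump.opcode == OP_JA) {            /* JA disp16 -> JAL disp32 */
        out[0] = (struct insn){ OP_JAL, 0, 0, (int32_t)disp };
        return 1;
    }

    /* Jxx disp16 -> Jxx +1; JA +1; JAL disp32 */
    jump.off = 1;                          /* taken: skip the JA, land on JAL */
    out[0] = jump;
    out[1] = (struct insn){ OP_JA, 0, 1, 0 };   /* not taken: skip the JAL */
    out[2] = (struct insn){ OP_JAL, 0, 0, (int32_t)(disp - 2) };
    return 3;
}
```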

So I think we are done with this.  Please let us know if these
instructions ever change.

Relevant binutils bugzillas (all now resolved as fixed):

* Make use of long range calls by relaxation (jal/gotol):
  https://sourceware.org/bugzilla/show_bug.cgi?id=30690

Relevant GCC bugzillas (all now resolved as fixed):

* Make use of signed-load instructions:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110782
  
* Make use of signed division/modulus:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110783

* Make use of signed mov instructions:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110784

* Make use of byte swap instructions:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110786

Salud!


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-28 16:41 GCC and binutils support for BPF V4 instructions Jose E. Marchesi
@ 2023-07-28 16:47 ` Alexei Starovoitov
  2023-07-28 17:06   ` Jose E. Marchesi
  2023-07-28 16:59 ` Yonghong Song
  1 sibling, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2023-07-28 16:47 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: bpf

On Fri, Jul 28, 2023 at 9:41 AM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
>
> Hello.
>
> Just a heads up regarding the new BPF V4 instructions and their support
> in the GNU Toolchain.

Awesome. Thanks for the update!

> V4 sdiv/smod instructions
>
>   Binutils has been updated to use the V4 encoding of these
>   instructions, which used to be part of the xbpf testing dialect used
>   in GCC.  GCC generates these instructions for signed division when
>   -mcpu=v4 or higher.

With sdiv/smod implemented do you still have a need for xbpf flag?
Anything still missing or you can start using -mcpu=v4 in gcc selftests
and remove xbpf completely?

>
> So I think we are done with this.  Please let us know if these
> instructions ever change.

Fingers crossed, they will never change.
How far are we from running bpf selftests with gcc?


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-28 16:47 ` Alexei Starovoitov
@ 2023-07-28 17:06   ` Jose E. Marchesi
  0 siblings, 0 replies; 15+ messages in thread
From: Jose E. Marchesi @ 2023-07-28 17:06 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: bpf


> On Fri, Jul 28, 2023 at 9:41 AM Jose E. Marchesi
> <jose.marchesi@oracle.com> wrote:
>>
>>
>> Hello.
>>
>> Just a heads up regarding the new BPF V4 instructions and their support
>> in the GNU Toolchain.
>
> Awesome. Thanks for the update!
>
>> V4 sdiv/smod instructions
>>
>>   Binutils has been updated to use the V4 encoding of these
>>   instructions, which used to be part of the xbpf testing dialect used
>>   in GCC.  GCC generates these instructions for signed division when
>>   -mcpu=v4 or higher.
>
> With sdiv/smod implemented do you still have a need for xbpf flag?
> Anything still missing or you can start using -mcpu=v4 in gcc selftests
> and remove xbpf completely?

Just `call %r' (what the clang disassembler calls callx.)

>> So I think we are done with this.  Please let us know if these
>> instructions ever change.
>
> Fingers crossed, they will never change.
> How far are we from running bpf selftests with gcc?

We are getting there, but not quite yet.
See https://gcc.gnu.org/wiki/BPFBackEnd where we track our work.

(The CO-RE builtins entry is basically done, but we are still polishing
 the details before sending the patch to GCC upstream.)
 


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-28 16:41 GCC and binutils support for BPF V4 instructions Jose E. Marchesi
  2023-07-28 16:47 ` Alexei Starovoitov
@ 2023-07-28 16:59 ` Yonghong Song
  2023-07-28 17:40   ` Jose E. Marchesi
  1 sibling, 1 reply; 15+ messages in thread
From: Yonghong Song @ 2023-07-28 16:59 UTC (permalink / raw)
  To: Jose E. Marchesi, bpf



On 7/28/23 9:41 AM, Jose E. Marchesi wrote:
> 
> Hello.
> 
> Just a heads up regarding the new BPF V4 instructions and their support
> in the GNU Toolchain.
> 
> V4 sdiv/smod instructions
> 
>    Binutils has been updated to use the V4 encoding of these
>    instructions, which used to be part of the xbpf testing dialect used
>    in GCC.  GCC generates these instructions for signed division when
>    -mcpu=v4 or higher.
> 
> V4 sign-extending register move instructions
> V4 signed load instructions
> V4 byte swap instructions
> 
>    Supported in assembler, disassembler and linker.  GCC generates these
>    instructions when -mcpu=v4 or higher.
> 
> V4 32-bit unconditional jump instruction
> 
>    Supported in assembler and disassembler.  GCC doesn't generate that
>    instruction.
> 
>    However, the assembler has been expanded in order to perform the
>    following relaxations when the disp16 field of a jump instruction is
>    known at assembly time, and is overflown, unless -mno-relax is
>    specified:
> 
>      JA disp16  -> JAL disp32
>      Jxx disp16 -> Jxx +1; JA +1; JAL disp32
> 
>    Where Jxx is one of the conditional jump instructions such as jeq,
>    jlt, etc.

Sounds great. The above 'JA/Jxx disp16' transformation matches
what llvm did as well.



* Re: GCC and binutils support for BPF V4 instructions
  2023-07-28 16:59 ` Yonghong Song
@ 2023-07-28 17:40   ` Jose E. Marchesi
  2023-07-28 18:01     ` Jose E. Marchesi
  0 siblings, 1 reply; 15+ messages in thread
From: Jose E. Marchesi @ 2023-07-28 17:40 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf


> Sounds great. The above 'JA/Jxx disp16' transformation matches
> what llvm did as well.

Not by chance ;)

Now what is pending in binutils is to relax these jumps in the linker as
well.  But it is very low priority, compared to getting these kernel
selftests building and running.  So it will happen, but probably not
anytime soon.



* Re: GCC and binutils support for BPF V4 instructions
  2023-07-28 17:40   ` Jose E. Marchesi
@ 2023-07-28 18:01     ` Jose E. Marchesi
  2023-07-28 23:49       ` Alexei Starovoitov
  0 siblings, 1 reply; 15+ messages in thread
From: Jose E. Marchesi @ 2023-07-28 18:01 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf


>> Sounds great. The above 'JA/Jxx disp16' transformation matches
>> what llvm did as well.
>
> Not by chance ;)
>
> Now what is pending in binutils is to relax these jumps in the linker as
> well.  But it is very low priority, compared to getting these kernel
> selftests building and running.  So it will happen, but probably not
> anytime soon.

By the way, for doing things like that (further object transformations
by linkers and the like) we will need to have the ELF files annotated
with:

- The BPF cpu version the object was compiled for: v1, v2, v3, v4, and

- Individual flags specifying the BPF cpu capabilities (alu32, bswap,
  jmp32, etc) required/expected by the code in the object.

Note it is interesting to be able to denote both, for flexibility.

There are 32 bits available for machine-specific flags in e_flags, which
are commonly used for this purpose by other arches.  For BPF I would
suggest something like:

#define EF_BPF_ALU32  0x00000001
#define EF_BPF_JMP32  0x00000002
#define EF_BPF_BSWAP  0x00000004
#define EF_BPF_SDIV   0x00000008
#define EF_BPF_CPUVER 0x00FF0000
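
As a sketch of how such flags could be packed and queried, assuming the
cpu version lives in bits 16..23 as the EF_BPF_CPUVER mask suggests.
The shift constant and the helper names are hypothetical and only
serve this illustration; the flag values are the ones proposed above,
with a `u' suffix added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EF_BPF_ALU32  0x00000001u
#define EF_BPF_JMP32  0x00000002u
#define EF_BPF_BSWAP  0x00000004u
#define EF_BPF_SDIV   0x00000008u
#define EF_BPF_CPUVER 0x00FF0000u

/* Hypothetical: position of the lowest set bit of EF_BPF_CPUVER. */
#define EF_BPF_CPUVER_SHIFT 16

/* Build an e_flags value from a cpu version and capability bits. */
static uint32_t bpf_eflags_make(unsigned cpuver, uint32_t caps)
{
    assert(cpuver <= 0xFFu);              /* must fit in the 8-bit field */
    assert((caps & EF_BPF_CPUVER) == 0);  /* caps must not touch it */
    return ((uint32_t)cpuver << EF_BPF_CPUVER_SHIFT) | caps;
}

/* Extract the cpu version (1 for v1, 4 for v4, ...). */
static unsigned bpf_eflags_cpuver(uint32_t e_flags)
{
    return (e_flags & EF_BPF_CPUVER) >> EF_BPF_CPUVER_SHIFT;
}

/* True if every capability bit in `cap` is set in `e_flags`. */
static bool bpf_eflags_has(uint32_t e_flags, uint32_t cap)
{
    return (e_flags & cap) == cap;
}
```

With these helpers, an object built for v4 that uses jmp32 and sdiv
would carry bpf_eflags_make(4, EF_BPF_JMP32 | EF_BPF_SDIV).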



* Re: GCC and binutils support for BPF V4 instructions
  2023-07-28 18:01     ` Jose E. Marchesi
@ 2023-07-28 23:49       ` Alexei Starovoitov
  2023-07-29  8:29         ` Jose E. Marchesi
  0 siblings, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2023-07-28 23:49 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: Yonghong Song, bpf

On Fri, Jul 28, 2023 at 11:01 AM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
>
> By the way, for doing things like that (further object transformations
> by linkers and the like) we will need to have the ELF files annotated
> with:
>
> - The BPF cpu version the object was compiled for: v1, v2, v3, v4, and
>
> - Individual flags specifying the BPF cpu capabilities (alu32, bswap,
>   jmp32, etc) required/expected by the code in the object.
>
> Note it is interesting to be able to denote both, for flexibility.
>
> There are 32 bits available for machine-specific flags in e_flags, which
> are commonly used for this purpose by other arches.  For BPF I would
> suggest something like:
>
> #define EF_BPF_ALU32  0x00000001
> #define EF_BPF_JMP32  0x00000002
> #define EF_BPF_BSWAP  0x00000004
> #define EF_BPF_SDIV   0x00000008
> #define EF_BPF_CPUVER 0x00FF0000

Interesting idea. I don't mind, but what are we going to do with this info?
I cannot think of anything useful libbpf could do with it.
For other archs such flags make sense, since disasm of everything
to discover properties is hard. For BPF we will parse all insns anyway,
so additional info in ELF doesn't give any additional insight.


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-28 23:49       ` Alexei Starovoitov
@ 2023-07-29  8:29         ` Jose E. Marchesi
  2023-07-29 17:56           ` Alexei Starovoitov
  0 siblings, 1 reply; 15+ messages in thread
From: Jose E. Marchesi @ 2023-07-29  8:29 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Yonghong Song, bpf


> On Fri, Jul 28, 2023 at 11:01 AM Jose E. Marchesi
> <jose.marchesi@oracle.com> wrote:
>>
>>
>> By the way, for doing things like that (further object transformations
>> by linkers and the like) we will need to have the ELF files annotated
>> with:
>>
>> - The BPF cpu version the object was compiled for: v1, v2, v3, v4, and
>>
>> - Individual flags specifying the BPF cpu capabilities (alu32, bswap,
>>   jmp32, etc) required/expected by the code in the object.
>>
>> Note it is interesting to be able to denote both, for flexibility.
>>
>> There are 32 bits available for machine-specific flags in e_flags, which
>> are commonly used for this purpose by other arches.  For BPF I would
>> suggest something like:
>>
>> #define EF_BPF_ALU32  0x00000001
>> #define EF_BPF_JMP32  0x00000002
>> #define EF_BPF_BSWAP  0x00000004
>> #define EF_BPF_SDIV   0x00000008
>> #define EF_BPF_CPUVER 0x00FF0000
>
> Interesting idea. I don't mind, but what are we going to do with this info?
> I cannot think of anything useful libbpf could do with it.
> For other archs such flags make sense, since disasm of everything
> to discover properties is hard. For BPF we will parse all insns anyway,
> so additional info in ELF doesn't give any additional insight.

I mainly had link-time relaxation in mind.  The linker needs to know
what instructions are available (JMP32 or not) in order to decide what
to relax, and to what.

Also as you mention the disassembler can look in the object to determine
which instructions shall be recognized and which instructions shall be
reported as <unknown>.  Right now it is necessary to pass an explicit
option to the assembler, and the default is v4.


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-29  8:29         ` Jose E. Marchesi
@ 2023-07-29 17:56           ` Alexei Starovoitov
  2023-07-30  4:54             ` Jose E. Marchesi
  0 siblings, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2023-07-29 17:56 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: Yonghong Song, bpf

On Sat, Jul 29, 2023 at 1:29 AM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
>
> > On Fri, Jul 28, 2023 at 11:01 AM Jose E. Marchesi
> > <jose.marchesi@oracle.com> wrote:
> >>
> >>
> >> By the way, for doing things like that (further object transformations
> >> by linkers and the like) we will need to have the ELF files annotated
> >> with:
> >>
> >> - The BPF cpu version the object was compiled for: v1, v2, v3, v4, and
> >>
> >> - Individual flags specifying the BPF cpu capabilities (alu32, bswap,
> >>   jmp32, etc) required/expected by the code in the object.
> >>
> >> Note it is interesting to be able to denote both, for flexibility.
> >>
> >> There are 32 bits available for machine-specific flags in e_flags, which
> >> are commonly used for this purpose by other arches.  For BPF I would
> >> suggest something like:
> >>
> >> #define EF_BPF_ALU32  0x00000001
> >> #define EF_BPF_JMP32  0x00000002
> >> #define EF_BPF_BSWAP  0x00000004
> >> #define EF_BPF_SDIV   0x00000008
> >> #define EF_BPF_CPUVER 0x00FF0000
> >
> > Interesting idea. I don't mind, but what are we going to do with this info?
> > I cannot think of anything useful libbpf could do with it.
> > For other archs such flags make sense, since disasm of everything
> > to discover properties is hard. For BPF we will parse all insns anyway,
> > so additional info in ELF doesn't give any additional insight.
>
> I mainly had link-time relaxation in mind.  The linker needs to know
> what instructions are available (JMP32 or not) in order to decide what
> to relax, and to what.

But the assembler has little choice when the jump target is >16bits.
It can use jmp32 or error.
I guess you're proposing to encode this e_flags in the text of asm ?
Special asm directive that will force asm to error or use jmp32?

> Also as you mention the disassembler can look in the object to determine
> which instructions shall be recognized and which instructions shall be
> reported as <unknown>.  Right now it is necessary to pass an explicit
> option to the assembler, and the default is v4.

Disambiguating between unknown and exact insn kinda makes sense for disasm.
For assembler it's kinda weird. If text says 'sdiv' the asm should emit
binary code for it regardless of asm directive.
It seems e_flags can only be emitted by assembler.
Like if it needs to use jmp32 it will add EF_BPF_JMP32.

Still feels that we can live without these flags, but not a bad addition.

As far as flag names, let's use EF_ prefix. I think it's more canonical.
And single 0xF is probably enough for cpu ver.


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-29 17:56           ` Alexei Starovoitov
@ 2023-07-30  4:54             ` Jose E. Marchesi
  2023-07-30 16:12               ` Yonghong Song
  2023-07-30 16:53               ` Alexei Starovoitov
  0 siblings, 2 replies; 15+ messages in thread
From: Jose E. Marchesi @ 2023-07-30  4:54 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Yonghong Song, bpf


> On Sat, Jul 29, 2023 at 1:29 AM Jose E. Marchesi
> <jose.marchesi@oracle.com> wrote:
>>
>>
>> > On Fri, Jul 28, 2023 at 11:01 AM Jose E. Marchesi
>> > <jose.marchesi@oracle.com> wrote:
>> >>
>> >>
>> >> By the way, for doing things like that (further object transformations
>> >> by linkers and the like) we will need to have the ELF files annotated
>> >> with:
>> >>
>> >> - The BPF cpu version the object was compiled for: v1, v2, v3, v4, and
>> >>
>> >> - Individual flags specifying the BPF cpu capabilities (alu32, bswap,
>> >>   jmp32, etc) required/expected by the code in the object.
>> >>
>> >> Note it is interesting to be able to denote both, for flexibility.
>> >>
>> >> There are 32 bits available for machine-specific flags in e_flags, which
>> >> are commonly used for this purpose by other arches.  For BPF I would
>> >> suggest something like:
>> >>
>> >> #define EF_BPF_ALU32  0x00000001
>> >> #define EF_BPF_JMP32  0x00000002
>> >> #define EF_BPF_BSWAP  0x00000004
>> >> #define EF_BPF_SDIV   0x00000008
>> >> #define EF_BPF_CPUVER 0x00FF0000
>> >
>> > Interesting idea. I don't mind, but what are we going to do with this info?
>> > I cannot think of anything useful libbpf could do with it.
>> > For other archs such flags make sense, since disasm of everything
>> > to discover properties is hard. For BPF we will parse all insns anyway,
>> > so additional info in ELF doesn't give any additional insight.
>>
>> I mainly had link-time relaxation in mind.  The linker needs to know
>> what instructions are available (JMP32 or not) in order to decide what
>> to relax, and to what.
>
> But the assembler has little choice when the jump target is >16bits.
> It can use jmp32 or error.

When the assembler sees a jump instruction:

   goto EXPR

there are several possibilities:

1. EXPR consists of a literal number like 1, -10 or 0xff, or an
   expression that can be resolved during the first assembler pass (like
   8 * 64).  The numerical result is interpreted as number of 64-bit
   words minus one.  In this case, the assembler can immediately decide
   whether the operand is >16 bits, relaxing to the jmp32 jump if cpu >=
   v4 and unless -mno-relax is passed in the command line.

2. EXPR is a symbolic expression involving a symbol that can be resolved
   during the second assembler pass.  For example, `foo + 10'.  In this
   case, there are two possibilities:

   2.1. The symbol is an absolute symbol.  In this case the value is
        interpreted as-such and no conversion is done by the assembler.
        So if for example the user invokes the assembler passing
        `--defsym foo=10', the assembled instruction is `ja 20'.

   2.2. The symbol is a PC-relative or section-relative symbol.  In this
        case the value is interpreted as a byte offset (the assembler
        takes care to transform offsets relative to the current section
        into PC-relative offsets whenever necessary).  This is the case
        of labels.  For these symbols, the BPF assembler converts the
        value from bytes to number of 64-bit words minus one.  So for
        example for `ja done' where `done' has the value 256 bytes, the
        assembled instruction is `ja 31'.

3. EXPR is a symbolic expression involving a symbol that cannot be
   resolved during the second assembler pass.  In this case, a
   relocation for the 16-bit immediate field in the instruction is
   generated in the assembled object.  There is no R_BPF_64_16
   relocation defined by BPF as of yet, so we are using
   R_BPF_GNU_64_16=256, which as we agreed uses a high relocation number
   to avoid collisions.  Since gas is a standalone assembler, it seems
   sensible to emit a relocation rather than erroring out in these
   situations.  ld knows how to handle these relocs when linking BPF
   objects together.
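
The cases above can be condensed into a small sketch.  This is not gas
code: the enum, the struct and the function name are invented for the
illustration, and only the bytes-to-words conversion and the
R_BPF_GNU_64_16 value come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define R_BPF_GNU_64_16 256  /* relocation number quoted above */

enum operand_kind {
    OPND_CONSTANT,     /* case 1: literal or first-pass constant */
    OPND_ABSOLUTE,     /* case 2.1: absolute symbol, taken as-is */
    OPND_PC_RELATIVE,  /* case 2.2: label, byte offset from the jump */
    OPND_UNRESOLVED    /* case 3: left to the linker */
};

struct jump_field {
    bool     needs_reloc;  /* true when a relocation must be emitted */
    unsigned reloc_type;   /* valid only when needs_reloc is true */
    int64_t  disp;         /* in 64-bit words minus one, otherwise */
};

static struct jump_field resolve_jump_operand(enum operand_kind kind,
                                              int64_t value)
{
    struct jump_field f = { false, 0, 0 };

    switch (kind) {
    case OPND_CONSTANT:
    case OPND_ABSOLUTE:
        f.disp = value;            /* already "words minus one" */
        break;
    case OPND_PC_RELATIVE:
        assert(value % 8 == 0);    /* instructions are 8 bytes each */
        f.disp = value / 8 - 1;    /* bytes -> 64-bit words minus one */
        break;
    case OPND_UNRESOLVED:
        f.needs_reloc = true;
        f.reloc_type = R_BPF_GNU_64_16;
        break;
    }
    return f;
}
```

For the examples in the text, a label 256 bytes ahead gives a
displacement of 31, and `foo + 10' with foo=10 gives 20 unchanged.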

> I guess you're proposing to encode this e_flags in the text of asm ?
> Special asm directive that will force asm to error or use jmp32?

GAS uses command-line options for that.

When GCC is invoked with -mcpu=v3, for example, it passes the
corresponding option to the assembler so it expects a BPF V3 assembly
program.  In that scenario, if the user does a jump in an inline asm to
a target that does not fit in 16 bits, the assembler will error out,
because relaxing to jmp32 is not a possibility in V3.  Ditto for
compiler options like -msdiv or -mjmp32, that both clang and GCC
support.

I don't know how clang configures its integrated assembler... I guess by
calling some function.  But it is the same principle: if you tell clang
to generate v3 bpf and you include a header that uses a v4 instruction
(or overflown jump that would require relaxation) in inline asm, you
want an error.

>> Also as you mention the disassembler can look in the object to determine
>> which instructions shall be recognized and which instructions shall be
>> reported as <unknown>.  Right now it is necessary to pass an explicit
>> option to the assembler, and the default is v4.
>
> Disambiguating between unknown and exact insn kinda makes sense for disasm.
> For assembler it's kinda weird. If text says 'sdiv' the asm should emit
> binary code for it regardless of asm directive.

Unless configured to not do so?  See above.

> It seems e_flags can only be emitted by assembler.
> Like if it needs to use jmp32 it will add EF_BPF_JMP32.

Yep.

> Still feels that we can live without these flags, but not a bad
> addition.

The individual flags... I am not sure, other arches have them, but maybe
having them in BPF doesn't make much sense and it is not worth the extra
complication and wasted bits in e_flags.  How realistic is it to expect
that some kernel may support a particular version of the BPF ISA, and
also have support for some particular instruction from a later ISA as
the result of a backport or something?  Not for me to judge... I was
already bitten by my utter ignorance on kernel business when I added
that silly useless -mkernel=VERSION option to GCC 8-)

What I am pretty sure is that we will need something like EF_BPF_CPUVER
if we are ever gonna support relaxation in any linker external to
libbpf, and also to detect (and error/warn) when several objects with
different BPF versions are linked together.

> As far as flag names, let's use EF_ prefix. I think it's more canonical.
> And single 0xF is probably enough for cpu ver.

Agreed.


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-30  4:54             ` Jose E. Marchesi
@ 2023-07-30 16:12               ` Yonghong Song
  2023-07-30 19:11                 ` Jose E. Marchesi
  2023-07-30 16:53               ` Alexei Starovoitov
  1 sibling, 1 reply; 15+ messages in thread
From: Yonghong Song @ 2023-07-30 16:12 UTC (permalink / raw)
  To: Jose E. Marchesi, Alexei Starovoitov; +Cc: bpf



On 7/29/23 9:54 PM, Jose E. Marchesi wrote:
> 
>> On Sat, Jul 29, 2023 at 1:29 AM Jose E. Marchesi
>> <jose.marchesi@oracle.com> wrote:
>>>
>>>
>>>> On Fri, Jul 28, 2023 at 11:01 AM Jose E. Marchesi
>>>> <jose.marchesi@oracle.com> wrote:
>>>>>
>>>>>
>>>>>>> On 7/28/23 9:41 AM, Jose E. Marchesi wrote:
>>>>>>>> Hello.
>>>>>>>> Just a heads up regarding the new BPF V4 instructions and their
>>>>>>>> support
>>>>>>>> in the GNU Toolchain.
>>>>>>>> V4 sdiv/smod instructions
>>>>>>>>     Binutils has been updated to use the V4 encoding of these
>>>>>>>>     instructions, which used to be part of the xbpf testing dialect used
>>>>>>>>     in GCC.  GCC generates these instructions for signed division when
>>>>>>>>     -mcpu=v4 or higher.
>>>>>>>> V4 sign-extending register move instructions
>>>>>>>> V4 signed load instructions
>>>>>>>> V4 byte swap instructions
>>>>>>>>     Supported in assembler, disassembler and linker.  GCC generates
>>>>>>>> these
>>>>>>>>     instructions when -mcpu=v4 or higher.
>>>>>>>> V4 32-bit unconditional jump instruction
>>>>>>>>     Supported in assembler and disassembler.  GCC doesn't generate
>>>>>>>> that
>>>>>>>>     instruction.
>>>>>>>>     However, the assembler has been expanded in order to perform the
>>>>>>>>     following relaxations when the disp16 field of a jump instruction is
>>>>>>>>     known at assembly time, and is overflown, unless -mno-relax is
>>>>>>>>     specified:
>>>>>>>>       JA disp16  -> JAL disp32
>>>>>>>>       Jxx disp16 -> Jxx +1; JA +1; JAL disp32
>>>>>>>>     Where Jxx is one of the conditional jump instructions such as
>>>>>>>> jeq,
>>>>>>>>     jlt, etc.
>>>>>>>
>>>>>>> Sounds great. The above 'JA/Jxx disp16' transformation matches
>>>>>>> what llvm did as well.
>>>>>>
>>>>>> Not by chance ;)
>>>>>>
>>>>>> Now what is pending in binutils is to relax these jumps in the linker as
>>>>>> well.  But it is very low priority, compared to getting these kernel
>>>>>> selftests building and running.  So it will happen, but probably not
>>>>>> anytime soon.
>>>>>
>>>>> By the way, for doing things like that (further object transformations
>>>>> by linkers and the like) we will need to have the ELF files annotated
>>>>> with:
>>>>>
>>>>> - The BPF cpu version the object was compiled for: v1, v2, v3, v4, and
>>>>>
>>>>> - Individual flags specifying the BPF cpu capabilities (alu32, bswap,
>>>>>    jmp32, etc) required/expected by the code in the object.
>>>>>
>>>>> Note it is interesting to be able to denote both, for flexibility.
>>>>>
>>>>> There are 32 bits available for machine-specific flags in e_flags, which
>>>>> are commonly used for this purpose by other arches.  For BPF I would
>>>>> suggest something like:
>>>>>
>>>>> #define EF_BPF_ALU32  0x00000001
>>>>> #define EF_BPF_JMP32  0x00000002
>>>>> #define EF_BPF_BSWAP  0x00000004
>>>>> #define EF_BPF_SDIV   0x00000008
>>>>> #define EF_BPF_CPUVER 0x00FF0000
>>>>
>>>> Interesting idea. I don't mind, but what are we going to do with this info?
>>>> I cannot think of anything useful libbpf could do with it.
>>>> For other archs such flags make sense, since disasm of everything
>>>> to discover properties is hard. For BPF we will parse all insns anyway,
>>>> so additional info in ELF doesn't give any additional insight.
>>>
>>> I mainly had link-time relaxation in mind.  The linker needs to know
>>> what instructions are available (JMP32 or not) in order to decide what
>>> to relax, and to what.
>>
>> But the assembler has little choice when the jump target is >16bits.
>> It can use jmp32 or error.
> 
> When the assembler sees a jump instruction:
> 
>     goto EXPR
> 
> there are several possibilities:
> 
> 1. EXPR consists of a literal number like 1, -10 or 0xff, or an
>     expression that can be resolved during the first assembler pass (like
>     8 * 64).  The numerical result is interpreted as number of 64-bit
>     words minus one.  In this case, the assembler can immediately decide
>     whether the operand is >16 bits, relaxing to the jmp32 jump if cpu >=
>     v4 and unless -mno-relax is passed in the command line.
> 
> 2. EXPR is a symbolic expression involving a symbol that can be resolved
>     during the second assembler pass.  For example, `foo + 10'.  In this
>     case, there are two possibilities:
> 
>     2.1. The symbol is an absolute symbol.  In this case the value is
>          interpreted as-such and no conversion is done by the assembler.
>          So if for example the user invokes the assembler passing
>          `--defsym foo=10', the assembled instruction is `ja 20'.
> 
>     2.2. The symbol is a PC-relative or section-relative symbol.  In this
>          case the value is interpreted as a byte offset (the assembler
>          takes care to transform offsets relative to the current section
>          into PC-relative offsets whenever necessary).  This is the case
>          of labels.  For these symbols, the BPF assembler converts the
>          value from bytes to number of 64-bit words minus one.  So for
>          example for `ja done' where `done' has the value 256 bytes, the
>          assembled instruction is `ja 31'.
> 
> 3. EXPR is a symbolic expression involving a symbol that cannot be
>     resolved during the second assembler pass.  In this case, a
>     relocation for the 16-bit immediate field in the instruction is
>     generated in the assembled object.  There is no R_BPF_64_16
>     relocation defined by BPF as of yet, so we are using
>     R_BPF_GNU_64_16=256, which as we agreed uses a high relocation number
>     to avoid collisions.  Since gas is a standalone assembler, it seems
>     sensible to emit a relocation rather than erroring out in these
>     situations.  ld knows how to handle these relocs when linking BPF
>     objects together.
> 
>> I guess you're proposing to encode this e_flags in the text of asm ?
>> Special asm directive that will force asm to error or use jmp32?
> 
> GAS uses command-line options for that.
> 
> When GCC is invoked with -mcpu=v3, for example, it passes the
> corresponding option to the assembler so it expects a BPF V3 assembly
> program.  In that scenario, if the user does a jump to an address that
> is >16bit in an inline asm, the assembler will error out, because
> relaxing to jmp32 is not a possibility in V3.  Ditto for compiler
> options like -msdiv or -mjmp32, that both clang and GCC support.
> 
> I don't know how clang configures its integrated assembler... I guess by
> calling some function.  But it is the same principle: if you tell clang
> to generate v3 bpf and you include a header that uses a v4 instruction
> (or overflown jump that would require relaxation) in inline asm, you
> want an error.

If -mcpu=<version> is specified in the clang command line,
then the cpu <version> will be encoded in IR and will be
passed to the integrated assembler. And if you specify
-mcpu=v3 in the command line and your code has
cpu v4 inline assembly code, the compiler will error out.

> 
>>> Also as you mention the disassembler can look in the object to determine
>>> which instructions shall be recognized and which instructions shall be
>>> reported as <unknown>.  Right now it is necessary to pass an explicit
>>> option to the assembler, and the default is v4.
>>
>> Disambiguating between unknown and exact insn kinda makes sense for disasm.
>> For assembler it's kinda weird. If text says 'sdiv' the asm should emit
>> binary code for it regardless of asm directive.
> 
> Unless configured to not do so?  See above.
> 
>> It seems e_flags can only be emitted by assembler.
>> Like if it needs to use jmp32 it will add EF_BPF_JMP32.
> 
> Yep.
> 
>> Still feels that we can live without these flags, but not a bad
>> addition.
> 
> The individual flags... I am not sure, other arches have them, but maybe
> having them in BPF doesn't make much sense and it is not worth the extra
> complication and wasted bits in e_flags.  How realistic is it to expect
> that some kernel may support a particular version of the BPF ISA, and
> also have support for some particular instruction from a later ISA as
> the result of a backport or something?  Not for me to judge... I was
> already bitten by my utter ignorance on kernel business when I added
> that silly useless -mkernel=VERSION option to GCC 8-)
> 
> What I am pretty sure is that we will need something like EF_BPF_CPUVER
> if we are ever gonna support relaxation in any linker external to
> libbpf, and also to detect (and error/warn) when several objects with
> different BPF versions are linked together.
> 
>> As far as flag names, let's use EF_ prefix. I think it's more canonical.
>> And single 0xF is probably enough for cpu ver.
> 
> Agreed.


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-30 16:12               ` Yonghong Song
@ 2023-07-30 19:11                 ` Jose E. Marchesi
  0 siblings, 0 replies; 15+ messages in thread
From: Jose E. Marchesi @ 2023-07-30 19:11 UTC (permalink / raw)
  To: Yonghong Song; +Cc: Alexei Starovoitov, bpf


> On 7/29/23 9:54 PM, Jose E. Marchesi wrote:
>> 
>>> On Sat, Jul 29, 2023 at 1:29 AM Jose E. Marchesi
>>> <jose.marchesi@oracle.com> wrote:
>>>>
>>>>
>>>>> On Fri, Jul 28, 2023 at 11:01 AM Jose E. Marchesi
>>>>> <jose.marchesi@oracle.com> wrote:
>>>>>>
>>>>>>
>>>>>>>> On 7/28/23 9:41 AM, Jose E. Marchesi wrote:
>>>>>>>>> Hello.
>>>>>>>>> Just a heads up regarding the new BPF V4 instructions and their
>>>>>>>>> support
>>>>>>>>> in the GNU Toolchain.
>>>>>>>>> V4 sdiv/smod instructions
>>>>>>>>>     Binutils has been updated to use the V4 encoding of these
>>>>>>>>>     instructions, which used to be part of the xbpf testing dialect used
>>>>>>>>>     in GCC.  GCC generates these instructions for signed division when
>>>>>>>>>     -mcpu=v4 or higher.
>>>>>>>>> V4 sign-extending register move instructions
>>>>>>>>> V4 signed load instructions
>>>>>>>>> V4 byte swap instructions
>>>>>>>>>     Supported in assembler, disassembler and linker.  GCC generates
>>>>>>>>> these
>>>>>>>>>     instructions when -mcpu=v4 or higher.
>>>>>>>>> V4 32-bit unconditional jump instruction
>>>>>>>>>     Supported in assembler and disassembler.  GCC doesn't generate
>>>>>>>>> that
>>>>>>>>>     instruction.
>>>>>>>>>     However, the assembler has been expanded in order to perform the
>>>>>>>>>     following relaxations when the disp16 field of a jump instruction is
>>>>>>>>>     known at assembly time, and is overflown, unless -mno-relax is
>>>>>>>>>     specified:
>>>>>>>>>       JA disp16  -> JAL disp32
>>>>>>>>>       Jxx disp16 -> Jxx +1; JA +1; JAL disp32
>>>>>>>>>     Where Jxx is one of the conditional jump instructions such as
>>>>>>>>> jeq,
>>>>>>>>>     jlt, etc.
>>>>>>>>
>>>>>>>> Sounds great. The above 'JA/Jxx disp16' transformation matches
>>>>>>>> what llvm did as well.
>>>>>>>
>>>>>>> Not by chance ;)
>>>>>>>
>>>>>>> Now what is pending in binutils is to relax these jumps in the linker as
>>>>>>> well.  But it is very low priority, compared to getting these kernel
>>>>>>> selftests building and running.  So it will happen, but probably not
>>>>>>> anytime soon.
>>>>>>
>>>>>> By the way, for doing things like that (further object transformations
>>>>>> by linkers and the like) we will need to have the ELF files annotated
>>>>>> with:
>>>>>>
>>>>>> - The BPF cpu version the object was compiled for: v1, v2, v3, v4, and
>>>>>>
>>>>>> - Individual flags specifying the BPF cpu capabilities (alu32, bswap,
>>>>>>    jmp32, etc) required/expected by the code in the object.
>>>>>>
>>>>>> Note it is interesting to be able to denote both, for flexibility.
>>>>>>
>>>>>> There are 32 bits available for machine-specific flags in e_flags, which
>>>>>> are commonly used for this purpose by other arches.  For BPF I would
>>>>>> suggest something like:
>>>>>>
>>>>>> #define EF_BPF_ALU32  0x00000001
>>>>>> #define EF_BPF_JMP32  0x00000002
>>>>>> #define EF_BPF_BSWAP  0x00000004
>>>>>> #define EF_BPF_SDIV   0x00000008
>>>>>> #define EF_BPF_CPUVER 0x00FF0000
>>>>>
>>>>> Interesting idea. I don't mind, but what are we going to do with this info?
>>>>> I cannot think of anything useful libbpf could do with it.
>>>>> For other archs such flags make sense, since disasm of everything
>>>>> to discover properties is hard. For BPF we will parse all insns anyway,
>>>>> so additional info in ELF doesn't give any additional insight.
>>>>
>>>> I mainly had link-time relaxation in mind.  The linker needs to know
>>>> what instructions are available (JMP32 or not) in order to decide what
>>>> to relax, and to what.
>>>
>>> But the assembler has little choice when the jump target is >16bits.
>>> It can use jmp32 or error.
>> When the assembler sees a jump instruction:
>>
>>     goto EXPR
>>
>> there are several possibilities:
>>
>> 1. EXPR consists of a literal number like 1, -10 or 0xff, or an
>>     expression that can be resolved during the first assembler pass (like
>>     8 * 64).  The numerical result is interpreted as number of 64-bit
>>     words minus one.  In this case, the assembler can immediately decide
>>     whether the operand is >16 bits, relaxing to the jmp32 jump if cpu >=
>>     v4 and unless -mno-relax is passed in the command line.
>>
>> 2. EXPR is a symbolic expression involving a symbol that can be resolved
>>     during the second assembler pass.  For example, `foo + 10'.  In this
>>     case, there are two possibilities:
>>
>>     2.1. The symbol is an absolute symbol.  In this case the value is
>>          interpreted as-such and no conversion is done by the assembler.
>>          So if for example the user invokes the assembler passing
>>          `--defsym foo=10', the assembled instruction is `ja 20'.
>>
>>     2.2. The symbol is a PC-relative or section-relative symbol.  In this
>>          case the value is interpreted as a byte offset (the assembler
>>          takes care to transform offsets relative to the current section
>>          into PC-relative offsets whenever necessary).  This is the case
>>          of labels.  For these symbols, the BPF assembler converts the
>>          value from bytes to number of 64-bit words minus one.  So for
>>          example for `ja done' where `done' has the value 256 bytes, the
>>          assembled instruction is `ja 31'.
>>
>> 3. EXPR is a symbolic expression involving a symbol that cannot be
>>     resolved during the second assembler pass.  In this case, a
>>     relocation for the 16-bit immediate field in the instruction is
>>     generated in the assembled object.  There is no R_BPF_64_16
>>     relocation defined by BPF as of yet, so we are using
>>     R_BPF_GNU_64_16=256, which as we agreed uses a high relocation number
>>     to avoid collisions.  Since gas is a standalone assembler, it seems
>>     sensible to emit a relocation rather than erroring out in these
>>     situations.  ld knows how to handle these relocs when linking BPF
>>     objects together.
>>
>>> I guess you're proposing to encode this e_flags in the text of asm ?
>>> Special asm directive that will force asm to error or use jmp32?
>>
>> GAS uses command-line options for that.
>>
>> When GCC is invoked with -mcpu=v3, for example, it passes the
>> corresponding option to the assembler so it expects a BPF V3 assembly
>> program.  In that scenario, if the user does a jump to an address that
>> is >16bit in an inline asm, the assembler will error out, because
>> relaxing to jmp32 is not a possibility in V3.  Ditto for compiler
>> options like -msdiv or -mjmp32, that both clang and GCC support.
>>
>> I don't know how clang configures its integrated assembler... I guess by
>> calling some function.  But it is the same principle: if you tell clang
>> to generate v3 bpf and you include a header that uses a v4 instruction
>> (or overflown jump that would require relaxation) in inline asm, you
>> want an error.
>
> If -mcpu=<version> is specified in the clang command line,
> then the cpu <version> will be encoded in IR and will be
> passed to the integrated assembler. And if you specify
> -mcpu=v3 in the command line and your code has
> cpu v4 inline assembly code, the compiler will error out.

Perfect :)
Thanks for the confirmation.

>> 
>>>> Also as you mention the disassembler can look in the object to determine
>>>> which instructions shall be recognized and which instructions shall be
>>>> reported as <unknown>.  Right now it is necessary to pass an explicit
>>>> option to the assembler, and the default is v4.
>>>
>>> Disambiguating between unknown and exact insn kinda makes sense for disasm.
>>> For assembler it's kinda weird. If text says 'sdiv' the asm should emit
>>> binary code for it regardless of asm directive.
>> Unless configured to not do so?  See above.
>> 
>>> It seems e_flags can only be emitted by assembler.
>>> Like if it needs to use jmp32 it will add EF_BPF_JMP32.
>> Yep.
>> 
>>> Still feels that we can live without these flags, but not a bad
>>> addition.
>> The individual flags... I am not sure, other arches have them, but maybe
>> having them in BPF doesn't make much sense and it is not worth the extra
>> complication and wasted bits in e_flags.  How realistic is it to expect
>> that some kernel may support a particular version of the BPF ISA, and
>> also have support for some particular instruction from a later ISA as
>> the result of a backport or something?  Not for me to judge... I was
>> already bitten by my utter ignorance on kernel business when I added
>> that silly useless -mkernel=VERSION option to GCC 8-)
>>
>> What I am pretty sure is that we will need something like EF_BPF_CPUVER
>> if we are ever gonna support relaxation in any linker external to
>> libbpf, and also to detect (and error/warn) when several objects with
>> different BPF versions are linked together.
>> 
>>> As far as flag names, let's use EF_ prefix. I think it's more canonical.
>>> And single 0xF is probably enough for cpu ver.
>> Agreed.


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-30  4:54             ` Jose E. Marchesi
  2023-07-30 16:12               ` Yonghong Song
@ 2023-07-30 16:53               ` Alexei Starovoitov
  2023-07-30 21:06                 ` Jose E. Marchesi
  1 sibling, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2023-07-30 16:53 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: Yonghong Song, bpf

On Sat, Jul 29, 2023 at 9:54 PM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
> The individual flags... I am not sure, other arches have them, but maybe
> having them in BPF doesn't make much sense and it is not worth the extra
> complication and wasted bits in e_flags.  How realistic is it to expect
> that some kernel may support a particular version of the BPF ISA, and
> also have support for some particular instruction from a later ISA as
> the result of a backport or something?  Not for me to judge... I was
> already bitten by my utter ignorance on kernel business when I added
> that silly useless -mkernel=VERSION option to GCC 8-)
>
> What I am pretty sure is that we will need something like EF_BPF_CPUVER
> if we are ever gonna support relaxation in any linker external to
> libbpf, and also to detect (and error/warn) when several objects with
> different BPF versions are linked together.

Ok. Let's start with EF_BPF_CPUVER 0xF
and not waste bits on individual instructions, as you said.
When kernel backports are done the patches are sent together.
It wouldn't be wise to backport SDIV without JMP32, for example.
git history will get screwed up and further backports will be a pain.
The risk of untested combinations increases, etc.
I think it's safe to assume that a given kernel will support either v3 or v4.
The kernel version doesn't matter, of course :)
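
For illustration, extracting the version from e_flags and checking two
objects for a mismatch could look roughly like the sketch below.  The
mask value follows the 0xF suggested above; the helper names and the
choice to treat a zero field as "not annotated" are assumptions, not
something any existing linker is known to do.

```c
#include <stdint.h>

#define EF_BPF_CPUVER 0x0000000f  /* low four bits of e_flags */

/* Hypothetical helper: BPF cpu version recorded in E_FLAGS,
   or 0 if the object carries no version annotation.  */
static unsigned int
bpf_elf_cpu_version (uint32_t e_flags)
{
  return e_flags & EF_BPF_CPUVER;
}

/* Hypothetical link-time check: two objects are considered compatible
   if they record the same version, or if either one is unannotated
   (an assumption made for this sketch).  */
static int
bpf_cpuver_compatible (uint32_t e_flags_a, uint32_t e_flags_b)
{
  unsigned int a = bpf_elf_cpu_version (e_flags_a);
  unsigned int b = bpf_elf_cpu_version (e_flags_b);

  return a == 0 || b == 0 || a == b;
}
```

A linker could use such a check to warn or error when a v3 object and a
v4 object are linked together.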


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-30 16:53               ` Alexei Starovoitov
@ 2023-07-30 21:06                 ` Jose E. Marchesi
  2023-07-31 18:10                   ` Alexei Starovoitov
  0 siblings, 1 reply; 15+ messages in thread
From: Jose E. Marchesi @ 2023-07-30 21:06 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Yonghong Song, bpf


> On Sat, Jul 29, 2023 at 9:54 PM Jose E. Marchesi
> <jose.marchesi@oracle.com> wrote:
>>
>> The individual flags... I am not sure, other arches have them, but maybe
>> having them in BPF doesn't make much sense and it is not worth the extra
>> complication and wasted bits in e_flags.  How realistic is it to expect
>> that some kernel may support a particular version of the BPF ISA, and
>> also have support for some particular instruction from a later ISA as
>> the result of a backport or something?  Not for me to judge... I was
>> already bitten by my utter ignorance on kernel business when I added
>> that silly useless -mkernel=VERSION option to GCC 8-)
>>
>> What I am pretty sure is that we will need something like EF_BPF_CPUVER
>> if we are ever gonna support relaxation in any linker external to
>> libbpf, and also to detect (and error/warn) when several objects with
>> different BPF versions are linked together.
>
> Ok. Let's start with EF_BPF_CPUVER 0xF
> and not waste bits on individual instructions, as you said.
> When kernel backports are done the patches are sent together.
> It wouldn't be wise to backport SDIV without JMP32, for example.
> git history will get screwed up and further backports will be a pain.
> The risk of untested combinations increases, etc.
> I think it's safe to assume that a given kernel will support either v3
> or v4.

This is good to know.  Thanks for explaining.

> The kernel version doesn't matter, of course :)

Yeah GCC no longer supports -mkernel :P

Alright, so I just pushed a binutils patch for elf.h, the disassembler,
the assembler and readelf:

  https://sourceware.org/pipermail/binutils/2023-July/128723.html

Note that the ISA version selection logic in the disassembler is:

1. If the user specifies an explicit version (v1, v2, v3, v4) then use
   it.

2. Otherwise, use the EF_BPF_CPUVER bits in the ELF header to derive the
   version to use:

   2.1. If the CPUVER is zero, then use the latest supported version
        (currently v4).  This is for backwards compatibility.

   2.2. Else, if CPUVER is one of the versions supported by the
        disassembler (currently 1, 2, 3 or 4) then use it.

   2.3. Else, emit an error "unknown BPF CPU version %d".

Maybe 2.3 should be a warning instead of an error...
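
The selection logic above can be sketched in C as follows.  This is only
an illustration and not the code in the patch: the function name, the
convention that 0 means "no explicit version given", and the -1 return
for the error case are all assumptions.

```c
#include <stdint.h>

#define EF_BPF_CPUVER 0x0000000f  /* low four bits of e_flags */
#define BPF_LATEST_VERSION 4      /* newest ISA version supported */

/* Hypothetical sketch of the version selection.  USER_VERSION is 0 when
   the user gave no explicit version.  Returns the ISA version to use,
   or -1 for an unknown CPUVER (step 2.3).  */
static int
bpf_select_isa_version (int user_version, uint32_t e_flags)
{
  unsigned int cpuver;

  /* 1. An explicit user choice wins.  */
  if (user_version >= 1 && user_version <= BPF_LATEST_VERSION)
    return user_version;

  /* 2. Otherwise derive the version from the ELF header.  */
  cpuver = e_flags & EF_BPF_CPUVER;
  if (cpuver == 0)
    return BPF_LATEST_VERSION;   /* 2.1: unannotated object */
  if (cpuver <= BPF_LATEST_VERSION)
    return (int) cpuver;         /* 2.2: supported version */
  return -1;                     /* 2.3: unknown version */
}
```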


* Re: GCC and binutils support for BPF V4 instructions
  2023-07-30 21:06                 ` Jose E. Marchesi
@ 2023-07-31 18:10                   ` Alexei Starovoitov
  0 siblings, 0 replies; 15+ messages in thread
From: Alexei Starovoitov @ 2023-07-31 18:10 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: Yonghong Song, bpf

On Sun, Jul 30, 2023 at 2:06 PM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
>
> > On Sat, Jul 29, 2023 at 9:54 PM Jose E. Marchesi
> > <jose.marchesi@oracle.com> wrote:
> >>
> >> The individual flags... I am not sure, other arches have them, but maybe
> >> having them in BPF doesn't make much sense and it is not worth the extra
> >> complication and wasted bits in e_flags.  How realistic is it to expect
> >> that some kernel may support a particular version of the BPF ISA, and
> >> also have support for some particular instruction from a later ISA as
> >> the result of a backport or something?  Not for me to judge... I was
> >> already bitten by my utter ignorance on kernel business when I added
> >> that silly useless -mkernel=VERSION option to GCC 8-)
> >>
> >> What I am pretty sure is that we will need something like EF_BPF_CPUVER
> >> if we are ever gonna support relaxation in any linker external to
> >> libbpf, and also to detect (and error/warn) when several objects with
> >> different BPF versions are linked together.
> >
> > Ok. Let's start with EF_BPF_CPUVER 0xF
> > and not waste bits on individual instructions, as you said.
> > When kernel backports are done the patches are sent together.
> > It wouldn't be wise to backport SDIV without JMP32, for example.
> > git history will get screwed up and further backports will be a pain.
> > The risk of untested combinations increases, etc.
> > I think it's safe to assume that a given kernel will support either v3
> > or v4.
>
> This is good to know.  Thanks for explaining.
>
> > The kernel version doesn't matter, of course :)
>
> Yeah GCC no longer supports -mkernel :P
>
> Alright, so I just pushed a binutils patch for elf.h, the disassembler,
> the assembler and readelf:
>
>   https://sourceware.org/pipermail/binutils/2023-July/128723.html
>
> Note that the ISA version selection logic in the disassembler is:
>
> 1. If the user specifies an explicit version (v1, v2, v3, v4) then use
>    it.
>
> 2. Otherwise, use the EF_BPF_CPUVER bits in the ELF header to derive the
>    version to use:
>
>    2.1. If the CPUVER is zero, then use the latest supported version
>         (currently v4).  This is for backwards compatibility.
>
>    2.2. Else, if CPUVER is one of the versions supported by the
>         disassembler (currently 1, 2, 3 or 4) then use it.
>
>    2.3. Else, emit an error "unknown BPF CPU version %d".
>
> Maybe 2.3 should be a warning instead of an error...

Warn is probably better.
Older disasm should still print what it knows about from newer ELF.
Unknown insns from cpu=v5 will be 'unknown'. That's better than no
output at all.
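
A sketch of that warn-and-continue behaviour, assuming a hypothetical
helper that the disassembler would call with the CPUVER value: on an
unrecognized version it prints a warning and falls back to the newest
version it knows, so known instructions still get decoded.

```c
#include <stdio.h>

#define BPF_LATEST_VERSION 4  /* newest ISA version this sketch knows */

/* Hypothetical helper: map the CPUVER value found in the ELF header to
   the ISA version to disassemble with.  Unknown (newer) values produce
   a warning on stderr rather than an error.  */
static int
bpf_cpuver_or_warn (unsigned int cpuver)
{
  if (cpuver >= 1 && cpuver <= BPF_LATEST_VERSION)
    return (int) cpuver;
  if (cpuver != 0)
    fprintf (stderr, "warning: unknown BPF CPU version %u, assuming v%d\n",
             cpuver, BPF_LATEST_VERSION);
  return BPF_LATEST_VERSION;
}
```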


end of thread, other threads:[~2023-07-31 18:11 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-28 16:41 GCC and binutils support for BPF V4 instructions Jose E. Marchesi
2023-07-28 16:47 ` Alexei Starovoitov
2023-07-28 17:06   ` Jose E. Marchesi
2023-07-28 16:59 ` Yonghong Song
2023-07-28 17:40   ` Jose E. Marchesi
2023-07-28 18:01     ` Jose E. Marchesi
2023-07-28 23:49       ` Alexei Starovoitov
2023-07-29  8:29         ` Jose E. Marchesi
2023-07-29 17:56           ` Alexei Starovoitov
2023-07-30  4:54             ` Jose E. Marchesi
2023-07-30 16:12               ` Yonghong Song
2023-07-30 19:11                 ` Jose E. Marchesi
2023-07-30 16:53               ` Alexei Starovoitov
2023-07-30 21:06                 ` Jose E. Marchesi
2023-07-31 18:10                   ` Alexei Starovoitov
