Masks and overflow of signed immediates in BPF instructions

public inbox for bpf@vger.kernel.org

* Masks and overflow of signed immediates in BPF instructions
@ 2023-08-15 14:19 Jose E. Marchesi
  2023-08-15 16:12 ` Yonghong Song
  0 siblings, 1 reply; 12+ messages in thread
From: Jose E. Marchesi @ 2023-08-15 14:19 UTC (permalink / raw)
  To: bpf; +Cc: david.faust, cupertino.miranda

Hello.

The selftest progs/verifier_masking.c contains inline assembly code
like:

  	w1 = 0xffffffff;

The 32-bit immediate of that instruction is signed.  Therefore, GAS
complains that the above instruction overflows its field:

  /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall fit in 32 bits

The llvm assembler is likely relying on signed overflow for the above to
work.

Using negative numbers to denote masks is ugly and obfuscating (for
non-obvious cases like -1/0xffffffff) so I suggest we introduce a
pseudo-op so we can do:

   w1 = %mask(0xffffffff)

allowing the assembler to do the right thing (TM) converting and
checking that the mask is valid and not relying on UB.
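
To illustrate what such a pseudo-op could check, here is a minimal C
sketch.  %mask does not exist in any assembler; the helper names
is_mask32 and mask_to_imm32 are assumptions made only for this example.
The idea is to accept a mask written as an unsigned number in the range
0..0xffffffff and produce the signed imm32 with the same bit pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: is the written number usable as a 32-bit mask?  */
static bool is_mask32(uint64_t written)
{
    return written <= UINT32_MAX;
}

/* Hypothetical conversion: the signed imm32 with the same bit pattern as
   the mask.  The arithmetic is done in 64 bits, so no out-of-range
   conversion is involved.  */
static int32_t mask_to_imm32(uint32_t mask)
{
    int64_t v = (int64_t)mask;
    if (v > INT32_MAX)
        v -= INT64_C(4294967296);   /* subtract 2^32 */
    return (int32_t)v;
}
```

With these helpers, 0xffffffff would be accepted and stored as -1, while
0x100000000 would be rejected as not being a 32-bit mask.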

Thoughts?


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-15 14:19 Masks and overflow of signed immediates in BPF instructions Jose E. Marchesi
@ 2023-08-15 16:12 ` Yonghong Song
  2023-08-15 17:01   ` Jose E. Marchesi
  0 siblings, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2023-08-15 16:12 UTC (permalink / raw)
  To: Jose E. Marchesi, bpf; +Cc: david.faust, cupertino.miranda



On 8/15/23 7:19 AM, Jose E. Marchesi wrote:
> 
> Hello.
> 
> The selftest progs/verifier_masking.c contains inline assembly code
> like:
> 
>    	w1 = 0xffffffff;
> 
> The 32-bit immediate of that instruction is signed.  Therefore, GAS
> complains that the above instruction overflows its field:
> 
>    /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall fit in 32 bits
> 
> The llvm assembler is likely relying on signed overflow for the above to
> work.

Not really.

   def _ri_32 : ALU_RI<BPF_ALU, Opc, off,
                    (outs GPR32:$dst),
                    (ins GPR32:$src2, i32imm:$imm),
                    "$dst "#OpcodeStr#" $imm",
                    [(set GPR32:$dst, (OpNode GPR32:$src2, i32immSExt32:$imm))]>;


If generating from source, the pattern
    [(set GPR32:$dst, (OpNode GPR32:$src2, i32immSExt32:$imm))]
applies: the value 0xffffffff is not SExt32, so it won't match, and
eventually an LDimm_64 insn will be generated.

But for inline asm, we will have
   (outs GPR32:$dst)
   (ins GPR32:$src2, i32imm:$imm)

and i32imm is defined as
   def i32imm : Operand<i32>;
which is an unsigned 32-bit value, so it is recognized properly
and the insn is encoded properly.
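
The difference between the two paths can be sketched in C.  This is only
an illustration of the "is this 64-bit constant the sign-extension of a
32-bit value" test that the SExt32 name suggests; the function name
is_sext32 is an assumption for the example and this is not the LLVM code.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the 64-bit value v is the sign-extension of some 32-bit value,
   i.e. it lies in the signed 32-bit range.  */
static bool is_sext32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

Seen as a 64-bit constant, 0xffffffff is 4294967295 and fails this test,
while -1 passes it.  The inline asm operand is not subject to this test.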

> 
> Using negative numbers to denote masks is ugly and obfuscating (for
> non-obvious cases like -1/0xffffffff) so I suggest we introduce a
> pseudo-op so we can do:
> 
>     w1 = %mask(0xffffffff)

I changed above
   w1 = 0xffffffff;
to
   w1 = %mask(0xffffffff)
and hit the following compilation failure.

progs/verifier_masking.c:54:9: error: invalid % escape in inline assembly string
    53 |         asm volatile ("                                 \
       |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    54 |         w1 = %mask(0xffffffff);                         \
       |                ^
1 error generated.

Do you have documentation on what this '%mask' thing is?

> 
> allowing the assembler to do the right thing (TM) converting and
> checking that the mask is valid and not relying on UB.
> 
> Thoughts?
> 


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-15 16:12 ` Yonghong Song
@ 2023-08-15 17:01   ` Jose E. Marchesi
  2023-08-15 17:28     ` Yonghong Song
  2023-08-16  9:36     ` Jose E. Marchesi
  0 siblings, 2 replies; 12+ messages in thread
From: Jose E. Marchesi @ 2023-08-15 17:01 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, david.faust, cupertino.miranda


> On 8/15/23 7:19 AM, Jose E. Marchesi wrote:
>> Hello.
>> The selftest progs/verifier_masking.c contains inline assembly code
>> like:
>>    	w1 = 0xffffffff;
>> The 32-bit immediate of that instruction is signed.  Therefore, GAS
>> complains that the above instruction overflows its field:
>>    /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall
>> fit in 32 bits
>> The llvm assembler is likely relying on signed overflow for the
>> above to
>> work.
>
> Not really.
>
>   def _ri_32 : ALU_RI<BPF_ALU, Opc, off,
>                    (outs GPR32:$dst),
>                    (ins GPR32:$src2, i32imm:$imm),
>                    "$dst "#OpcodeStr#" $imm",
>                    [(set GPR32:$dst, (OpNode GPR32:$src2,
>                    i32immSExt32:$imm))]>;
>
>
> If generating from source, the pattern [(set GPR32:$dst, (OpNode
> GPR32:$src2, i32immSExt32:$imm))] so value 0xffffffff is not SExt32
> and it won't match and eventually a LDimm_64 insn will be generated.

If by "generating from source" you mean compiling from C, then sure, I
wasn't implying clang was generating `r1 = 0xffffffff' for assigning
that positive value to a register.

> But for inline asm, we will have
>   (outs GPR32:$dst)
>   (ins GPR32:$src2, i32imm:$imm)
>
> and i32imm is defined as
>   def i32imm : Operand<i32>;
> which is a unsigned 32bit value, so it is recognized properly
> and the insn is encoded properly.

We thought the imm32 operand in ALU instructions is signed, not
unsigned.  Is it really unsigned??

>> Using negative numbers to denote masks is ugly and obfuscating (for
>> non-obvious cases like -1/0xffffffff) so I suggest we introduce a
>> pseudo-op so we can do:
>>     w1 = %mask(0xffffffff)
>
> I changed above
>   w1 = 0xffffffff;
> to
>   w1 = %mask(0xffffffff)
> and hit the following compilation failure.
>
> progs/verifier_masking.c:54:9: error: invalid % escape in inline
> assembly string
>    53 |         asm volatile ("                                 \
>       |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>    54 |         w1 = %mask(0xffffffff);                         \
>       |                ^
> 1 error generated.
>
> Do you have documentation what is '%mask' thing?

It doesn't exist.

I am suggesting to add support for that pseudo-op to the BPF assemblers:
both GAS and the llvm BPF assembler.

>> allowing the assembler to do the right thing (TM) converting and
>> checking that the mask is valid and not relying on UB.
>> Thoughts?
>> 


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-15 17:01   ` Jose E. Marchesi
@ 2023-08-15 17:28     ` Yonghong Song
  2023-08-16  9:36     ` Jose E. Marchesi
  1 sibling, 0 replies; 12+ messages in thread
From: Yonghong Song @ 2023-08-15 17:28 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: bpf, david.faust, cupertino.miranda



On 8/15/23 10:01 AM, Jose E. Marchesi wrote:
> 
>> On 8/15/23 7:19 AM, Jose E. Marchesi wrote:
>>> Hello.
>>> The selftest progs/verifier_masking.c contains inline assembly code
>>> like:
>>>     	w1 = 0xffffffff;
>>> The 32-bit immediate of that instruction is signed.  Therefore, GAS
>>> complains that the above instruction overflows its field:
>>>     /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall
>>> fit in 32 bits
>>> The llvm assembler is likely relying on signed overflow for the
>>> above to
>>> work.
>>
>> Not really.
>>
>>    def _ri_32 : ALU_RI<BPF_ALU, Opc, off,
>>                     (outs GPR32:$dst),
>>                     (ins GPR32:$src2, i32imm:$imm),
>>                     "$dst "#OpcodeStr#" $imm",
>>                     [(set GPR32:$dst, (OpNode GPR32:$src2,
>>                     i32immSExt32:$imm))]>;
>>
>>
>> If generating from source, the pattern [(set GPR32:$dst, (OpNode
>> GPR32:$src2, i32immSExt32:$imm))] so value 0xffffffff is not SExt32
>> and it won't match and eventually a LDimm_64 insn will be generated.
> 
> If by "generating from source" you mean compiling from C, then sure, I
> wasn't implying clang was generating `r1 = 0xffffffff' for assigning
> that positive value to a register.
> 
>> But for inline asm, we will have
>>    (outs GPR32:$dst)
>>    (ins GPR32:$src2, i32imm:$imm)
>>
>> and i32imm is defined as
>>    def i32imm : Operand<i32>;
>> which is a unsigned 32bit value, so it is recognized properly
>> and the insn is encoded properly.
> 
> We thought the imm32 operand in ALU instructions is signed, not
> unsigned.  Is it really unsigned??

The 'i32' in LLVM just represents a 4-byte value; there is no
signedness attached, which I interpret as unsigned.
See the example below:

$ cat t.c
int a;
unsigned b;
long c;
long add1() { return a + c; }
long add2() { return b + c; }
$ clang --target=bpf -O2 -S -emit-llvm t.c
$ cat t.ll
; ModuleID = 't.c'
source_filename = "t.c"
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "bpf"

@a = dso_local local_unnamed_addr global i32 0, align 4
@c = dso_local local_unnamed_addr global i64 0, align 8
@b = dso_local local_unnamed_addr global i32 0, align 4

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(read, argmem: none, inaccessiblemem: none)
define dso_local i64 @add1() local_unnamed_addr #0 {
entry:
   %0 = load i32, ptr @a, align 4, !tbaa !3
   %conv = sext i32 %0 to i64
   %1 = load i64, ptr @c, align 8, !tbaa !7
   %add = add nsw i64 %1, %conv
   ret i64 %add
}

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(read, argmem: none, inaccessiblemem: none)
define dso_local i64 @add2() local_unnamed_addr #0 {
entry:
   %0 = load i32, ptr @b, align 4, !tbaa !3
   %conv = zext i32 %0 to i64
   %1 = load i64, ptr @c, align 8, !tbaa !7
   %add = add nsw i64 %1, %conv
   ret i64 %add
}

You can see that the global variables 'a' and 'b' are both defined as
'i32', and 'c' as 'i64'. The signedness of the C type is only used
during IR code generation, where it selects 'sext' or 'zext'.
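
The same point can be sketched in plain C: one 32-bit pattern, widened in
the two ways the IR above uses.  The function names widen_sext and
widen_zext are assumptions chosen for the example.

```c
#include <stdint.h>

/* Widen a 32-bit pattern the way `sext i32 ... to i64` does.  */
static int64_t widen_sext(uint32_t bits)
{
    int64_t v = (int64_t)bits;
    /* If bit 31 is set, the signed reading is the value minus 2^32.  */
    return (bits & UINT32_C(0x80000000)) ? v - INT64_C(4294967296) : v;
}

/* Widen the same 32-bit pattern the way `zext i32 ... to i64` does.  */
static int64_t widen_zext(uint32_t bits)
{
    return (int64_t)bits;
}
```

For the pattern 0xffffffff the first function yields -1 and the second
4294967295; the stored 32 bits are identical in both cases.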

> 
>>> Using negative numbers to denote masks is ugly and obfuscating (for
>>> non-obvious cases like -1/0xffffffff) so I suggest we introduce a
>>> pseudo-op so we can do:
>>>      w1 = %mask(0xffffffff)
>>
>> I changed above
>>    w1 = 0xffffffff;
>> to
>>    w1 = %mask(0xffffffff)
>> and hit the following compilation failure.
>>
>> progs/verifier_masking.c:54:9: error: invalid % escape in inline
>> assembly string
>>     53 |         asm volatile ("                                 \
>>        |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>     54 |         w1 = %mask(0xffffffff);                         \
>>        |                ^
>> 1 error generated.
>>
>> Do you have documentation what is '%mask' thing?
> 
> It doesn't exist.
> 
> I am suggesting to add support for that pseudo-op to the BPF assemblers:
> both GAS and the llvm BPF assembler.
> 
>>> allowing the assembler to do the right thing (TM) converting and
>>> checking that the mask is valid and not relying on UB.
>>> Thoughts?
>>>


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-15 17:01   ` Jose E. Marchesi
  2023-08-15 17:28     ` Yonghong Song
@ 2023-08-16  9:36     ` Jose E. Marchesi
  2023-08-16 16:22       ` Yonghong Song
  1 sibling, 1 reply; 12+ messages in thread
From: Jose E. Marchesi @ 2023-08-16  9:36 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, david.faust, cupertino.miranda


>> On 8/15/23 7:19 AM, Jose E. Marchesi wrote:
>>> Hello.
>>> The selftest progs/verifier_masking.c contains inline assembly code
>>> like:
>>>    	w1 = 0xffffffff;
>>> The 32-bit immediate of that instruction is signed.  Therefore, GAS
>>> complains that the above instruction overflows its field:
>>>    /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall
>>> fit in 32 bits
>>> The llvm assembler is likely relying on signed overflow for the
>>> above to
>>> work.
>>
>> Not really.
>>
>>   def _ri_32 : ALU_RI<BPF_ALU, Opc, off,
>>                    (outs GPR32:$dst),
>>                    (ins GPR32:$src2, i32imm:$imm),
>>                    "$dst "#OpcodeStr#" $imm",
>>                    [(set GPR32:$dst, (OpNode GPR32:$src2,
>>                    i32immSExt32:$imm))]>;
>>
>>
>> If generating from source, the pattern [(set GPR32:$dst, (OpNode
>> GPR32:$src2, i32immSExt32:$imm))] so value 0xffffffff is not SExt32
>> and it won't match and eventually a LDimm_64 insn will be generated.
>
> If by "generating from source" you mean compiling from C, then sure, I
> wasn't implying clang was generating `r1 = 0xffffffff' for assigning
> that positive value to a register.
>
>> But for inline asm, we will have
>>   (outs GPR32:$dst)
>>   (ins GPR32:$src2, i32imm:$imm)
>>
>> and i32imm is defined as
>>   def i32imm : Operand<i32>;
>> which is a unsigned 32bit value, so it is recognized properly
>> and the insn is encoded properly.
>
> We thought the imm32 operand in ALU instructions is signed, not
> unsigned.  Is it really unsigned??

I am going through all the BPF instructions that get 32-bit, 16-bit and
64-bit immediates, because it seems to me that we may need to
distinguish between two different levels:

- Value encoded in the instruction immediate: interpreted as signed or
  as unsigned.

- How the assembler interprets a written number for the corresponding
  instruction operand: for example, for which instructions the assembler
  shall accept 0xfffffffe, 4294967294 and -2 as all denoting the same
  value, what that value is (negative or positive), or whether it shall
  emit an overflow error.
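
The second level can be made concrete with a small C sketch.  It assumes
a very simple parser built on strtoll, with no error handling; the name
imm32_bits is hypothetical and this is not how either assembler is
implemented.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a written number and return the 32-bit pattern it would occupy
   in an imm32 field.  Errors and out-of-range input are not diagnosed
   in this sketch.  */
static uint32_t imm32_bits(const char *text)
{
    /* Base 0 accepts a 0x prefix, plain decimal, and a leading sign.  */
    long long value = strtoll(text, NULL, 0);
    /* Conversion to an unsigned type reduces the value modulo 2^32.  */
    return (uint32_t)value;
}
```

Under these assumptions "0xfffffffe", "4294967294" and "-2" all produce
the pattern 0xfffffffe.  Whether that pattern then means -2 or 4294967294
is the first level, and is decided by the instruction.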

Will follow up with a summary that hopefully will serve to clarify this.

>>> Using negative numbers to denote masks is ugly and obfuscating (for
>>> non-obvious cases like -1/0xffffffff) so I suggest we introduce a
>>> pseudo-op so we can do:
>>>     w1 = %mask(0xffffffff)
>>
>> I changed above
>>   w1 = 0xffffffff;
>> to
>>   w1 = %mask(0xffffffff)
>> and hit the following compilation failure.
>>
>> progs/verifier_masking.c:54:9: error: invalid % escape in inline
>> assembly string
>>    53 |         asm volatile ("                                 \
>>       |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>    54 |         w1 = %mask(0xffffffff);                         \
>>       |                ^
>> 1 error generated.
>>
>> Do you have documentation what is '%mask' thing?
>
> It doesn't exist.
>
> I am suggesting to add support for that pseudo-op to the BPF assemblers:
> both GAS and the llvm BPF assembler.
>
>>> allowing the assembler to do the right thing (TM) converting and
>>> checking that the mask is valid and not relying on UB.
>>> Thoughts?
>>> 


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-16  9:36     ` Jose E. Marchesi
@ 2023-08-16 16:22       ` Yonghong Song
  2023-08-17  8:01         ` Jose E. Marchesi
  0 siblings, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2023-08-16 16:22 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: bpf, david.faust, cupertino.miranda



On 8/16/23 2:36 AM, Jose E. Marchesi wrote:
> 
>>> On 8/15/23 7:19 AM, Jose E. Marchesi wrote:
>>>> Hello.
>>>> The selftest progs/verifier_masking.c contains inline assembly code
>>>> like:
>>>>     	w1 = 0xffffffff;
>>>> The 32-bit immediate of that instruction is signed.  Therefore, GAS
>>>> complains that the above instruction overflows its field:
>>>>     /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall
>>>> fit in 32 bits
>>>> The llvm assembler is likely relying on signed overflow for the
>>>> above to
>>>> work.
>>>
>>> Not really.
>>>
>>>    def _ri_32 : ALU_RI<BPF_ALU, Opc, off,
>>>                     (outs GPR32:$dst),
>>>                     (ins GPR32:$src2, i32imm:$imm),
>>>                     "$dst "#OpcodeStr#" $imm",
>>>                     [(set GPR32:$dst, (OpNode GPR32:$src2,
>>>                     i32immSExt32:$imm))]>;
>>>
>>>
>>> If generating from source, the pattern [(set GPR32:$dst, (OpNode
>>> GPR32:$src2, i32immSExt32:$imm))] so value 0xffffffff is not SExt32
>>> and it won't match and eventually a LDimm_64 insn will be generated.
>>
>> If by "generating from source" you mean compiling from C, then sure, I
>> wasn't implying clang was generating `r1 = 0xffffffff' for assigning
>> that positive value to a register.
>>
>>> But for inline asm, we will have
>>>    (outs GPR32:$dst)
>>>    (ins GPR32:$src2, i32imm:$imm)
>>>
>>> and i32imm is defined as
>>>    def i32imm : Operand<i32>;
>>> which is a unsigned 32bit value, so it is recognized properly
>>> and the insn is encoded properly.
>>
>> We thought the imm32 operand in ALU instructions is signed, not
>> unsigned.  Is it really unsigned??
> 
> I am going through all the BPF instructions that get 32-bit, 16-bit and
> 64-bit immediates, because it seems to me that we may need to
> distinguish between two different levels:
> 
> - Value encoded in the instruction immediate: interpreted as signed or
>    as unsigned.

The 'imm' in the instruction is a 32-bit signed value.
I think we have no dispute here.

> 
> - How the assembler interprets a written number for the corresponding
>    instruction operand: for example, for which instructions the assembler
>    shall accept 0xfffffffe and 4294967294 and -2 all to denote the same
>    value, what value is it (negative or positive) or shall it emit an
>    overflow error.

In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
4-byte bit-wise encoding, so they will all be encoded as the same
0xfffffffe in the actual insn.

The following is an example for x86 target in llvm:

$ cat t.c
int foo() {
   int a, b;

   asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
   asm volatile("movl $-2, %0" : "=r"(b) :);
   return a + b;
}
$ clang -O2 -c t.c
$ llvm-objdump -d t.o

t.o:    file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <foo>:
        0: b9 fe ff ff ff                movl    $0xfffffffe, %ecx  # imm = 0xFFFFFFFE
        5: b8 fe ff ff ff                movl    $0xfffffffe, %eax  # imm = 0xFFFFFFFE
        a: 01 c8                         addl    %ecx, %eax
        c: c3                            retq
$

Whether it is 0xfffffffe or -2, the insn encoding is the same
and disasm prints out 0xfffffffe.
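
The BPF analogue can be sketched in C.  The function below packs a
"wN = imm" instruction (opcode 0xb4, i.e. BPF_ALU | BPF_MOV | BPF_K) into
a 64-bit word whose least significant byte is the first byte of the
instruction on a little-endian target.  The function name is an
assumption for the example and it covers only this one instruction.

```c
#include <stdint.h>

/* Pack "w<dst_reg> = imm" into 8 bytes, returned as a 64-bit word:
   byte 0 opcode, byte 1 registers, bytes 2-3 offset, bytes 4-7 imm.  */
static uint64_t encode_mov32_imm(unsigned dst_reg, uint32_t imm_bits)
{
    uint64_t opcode = 0xb4;             /* BPF_ALU | BPF_MOV | BPF_K */
    uint64_t regs = dst_reg & 0x0f;     /* dst in low nibble, src = 0 */
    /* The 16-bit offset field is left as zero for this instruction.  */
    return opcode | (regs << 8) | ((uint64_t)imm_bits << 32);
}
```

Passing 0xfffffffe or (uint32_t)-2 as imm_bits gives the same word, just
as the two movl instructions above get the same bytes.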

> 
> Will follow up with a summary that hopefully will serve to clarify this.
> 
>>>> Using negative numbers to denote masks is ugly and obfuscating (for
>>>> non-obvious cases like -1/0xffffffff) so I suggest we introduce a
>>>> pseudo-op so we can do:
>>>>      w1 = %mask(0xffffffff)
>>>
>>> I changed above
>>>    w1 = 0xffffffff;
>>> to
>>>    w1 = %mask(0xffffffff)
>>> and hit the following compilation failure.
>>>
>>> progs/verifier_masking.c:54:9: error: invalid % escape in inline
>>> assembly string
>>>     53 |         asm volatile ("                                 \
>>>        |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>     54 |         w1 = %mask(0xffffffff);                         \
>>>        |                ^
>>> 1 error generated.
>>>
>>> Do you have documentation what is '%mask' thing?
>>
>> It doesn't exist.
>>
>> I am suggesting to add support for that pseudo-op to the BPF assemblers:
>> both GAS and the llvm BPF assembler.
>>
>>>> allowing the assembler to do the right thing (TM) converting and
>>>> checking that the mask is valid and not relying on UB.
>>>> Thoughts?
>>>>


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-16 16:22       ` Yonghong Song
@ 2023-08-17  8:01         ` Jose E. Marchesi
  2023-08-17 16:23           ` Yonghong Song
  0 siblings, 1 reply; 12+ messages in thread
From: Jose E. Marchesi @ 2023-08-17  8:01 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, david.faust, cupertino.miranda


> [...]
> In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
> 4-byte bit-wise encoding, so they will be all encoded the same
> 0xfffffffe in the actual insn.
>
> The following is an example for x86 target in llvm:
>
> $ cat t.c
> int foo() {
>   int a, b;
>
>   asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
>   asm volatile("movl $-2, %0" : "=r"(b) :);
>   return a + b;
> }
> $ clang -O2 -c t.c
> $ llvm-objdump -d t.o
>
> t.o:    file format elf64-x86-64
>
> Disassembly of section .text:
>
> 0000000000000000 <foo>:
>        0: b9 fe ff ff ff                movl    $0xfffffffe, %ecx #
>       imm = 0xFFFFFFFE
>        5: b8 fe ff ff ff                movl    $0xfffffffe, %eax #
>       imm = 0xFFFFFFFE
>        a: 01 c8                         addl    %ecx, %eax
>        c: c3                            retq
> $
>
> Whether it is 0xfffffffe or -2, the insn encoding is the same
> and disasm prints out 0xfffffffe.

Thanks for the explanation.

I have pushed the commit below to binutils that makes GAS match the llvm
assembler behavior regarding constant immediates.  With this patch there
are no more assembler errors when building the kernel bpf selftests.

Note however that there is one pending divergence in the behavior of
both assemblers when facing invalid programs where immediate operands
cannot be represented in the number of bits of the field like in:

  $ cat foo.s
  if r1 > r2 goto 0x3fff1

llvm silently truncates it to 16-bit:

  $ clang -target bpf foo.s
  $ bpf-unknown-none-objdump -M pseudoc -dr foo.o
  0000000000000000 <.text>:
     0:	2d 21 f1 ff 00 00 00 00 	if r1>r2 goto -15

GAS emits an error instead:

  $ as -mdialect=pseudoc foo.s
  foo.s: Assembler messages:
  foo.s:1: Error: pc-relative offset out of range, shall fit in 16 bits.

(The same happens with 32-bit immediates.)

We think the error is pertinent, and we recommend that the llvm
assembler behave the same way.
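
The arithmetic behind the "goto -15" above can be sketched in C.  The
first function mimics silent truncation to 16 bits; the second is a
strict signed range check given as one possible policy.  Both names are
assumptions for the example, and neither is taken from an assembler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Keep only the low 16 bits and read them back as a signed offset.  */
static int16_t truncate_to_disp16(int64_t value)
{
    int32_t low = (int32_t)(value & 0xffff);    /* 0 .. 65535 */
    return (int16_t)(low >= 0x8000 ? low - 0x10000 : low);
}

/* Strict check: can the value be stored as a signed 16-bit offset?  */
static bool fits_signed16(int64_t value)
{
    return value >= INT16_MIN && value <= INT16_MAX;
}
```

0x3fff1 has low 16 bits 0xfff1, which read as a signed 16-bit number is
-15, matching the disassembly.  The range check rejects 0x3fff1.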

commit 5be1b787276d2adbe85ae7febc709ca517b62f08
Author: Jose E. Marchesi <jose.marchesi@oracle.com>
Date:   Thu Aug 17 09:38:37 2023 +0200

    bpf: gas: consolidate handling of immediate overflows
    
    This commit changes the BPF GAS port in order to handle immediate
    overflows the same way as the clang BPF assembler:
    
    - For an immediate field of N bits, any written number (positive or
      negative) whose two's complement encoding fits in N bits is accepted.
      This means that -2 is the same as 0xfffffffe.  It is up to the
      instructions to decide how to interpret the encoded value.
    
    - Immediate fields in jump instructions are no longer relaxed.
      Relaxing to jump instructions with wider range is only performed
      when expressions are involved.
    
    - The manual is updated to document this, and testsuite adapted
      accordingly.
    
    Tested in x86_64-linux-gnu host, bpf-unknown-none target.
    
    gas/ChangeLog:
    
    2023-08-17  Jose E. Marchesi  <jose.marchesi@oracle.com>
    
            * config/tc-bpf.c (check_immediate_overflow): New function.
            (encode_insn): Use check_immediate_overflow.
            (md_assemble): Do not relax instructions with
            constant disp16 fields.
            * doc/c-bpf.texi (BPF Instructions): Add note about how numerical
            literal values are interpreted for instruction immediate operands.
            * testsuite/gas/bpf/disp16-overflow.s: Adapt accordingly.
            * testsuite/gas/bpf/jump-relax-jump.s: Likewise.
            * testsuite/gas/bpf/jump-relax-jump.d: Likewise.
            * testsuite/gas/bpf/jump-relax-jump-be.d: Likewise.
            * testsuite/gas/bpf/jump-relax-ja.s: Likewise.
            * testsuite/gas/bpf/jump-relax-ja.d: Likewise.
            * testsuite/gas/bpf/jump-relax-ja-be.d: Likewise.
            * testsuite/gas/bpf/disp16-overflow-relax.l: Likewise.
            * testsuite/gas/bpf/imm32-overflow.s: Likewise.
            * testsuite/gas/bpf/disp32-overflow.s: Likewise.
            * testsuite/gas/bpf/disp16-overflow.l: Likewise.
            * testsuite/gas/bpf/disp32-overflow.l: Likewise.
            * testsuite/gas/bpf/imm32-overflow.l: Likewise.
            * testsuite/gas/bpf/offset16-overflow.l: Likewise.


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-17  8:01         ` Jose E. Marchesi
@ 2023-08-17 16:23           ` Yonghong Song
  2023-08-17 17:14             ` Yonghong Song
  0 siblings, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2023-08-17 16:23 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: bpf, david.faust, cupertino.miranda



On 8/17/23 1:01 AM, Jose E. Marchesi wrote:
> 
>> [...]
>> In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
>> 4-byte bit-wise encoding, so they will be all encoded the same
>> 0xfffffffe in the actual insn.
>>
>> The following is an example for x86 target in llvm:
>>
>> $ cat t.c
>> int foo() {
>>    int a, b;
>>
>>    asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
>>    asm volatile("movl $-2, %0" : "=r"(b) :);
>>    return a + b;
>> }
>> $ clang -O2 -c t.c
>> $ llvm-objdump -d t.o
>>
>> t.o:    file format elf64-x86-64
>>
>> Disassembly of section .text:
>>
>> 0000000000000000 <foo>:
>>         0: b9 fe ff ff ff                movl    $0xfffffffe, %ecx #
>>        imm = 0xFFFFFFFE
>>         5: b8 fe ff ff ff                movl    $0xfffffffe, %eax #
>>        imm = 0xFFFFFFFE
>>         a: 01 c8                         addl    %ecx, %eax
>>         c: c3                            retq
>> $
>>
>> Whether it is 0xfffffffe or -2, the insn encoding is the same
>> and disasm prints out 0xfffffffe.
> 
> Thanks for the explanation.
> 
> I have pushed the commit below to binutils that makes GAS match the llvm
> assembler behavior regarding constant immediates.  With this patch there
> are no more assembler errors when building the kernel bpf selftests.

Great! Thanks.

> 
> Note however that there is one pending divergence in the behavior of
> both assemblers when facing invalid programs where immediate operands
> cannot be represented in the number of bits of the field like in:
> 
>    $ cat foo.s
>    if r1 > r2 goto 0x3fff1
> 
> llvm silently truncates it to 16-bit:
> 
>    $ clang -target bpf foo.s
>    $ bpf-unknown-none-objdump -M pseudoc -dr foo.o
>    0000000000000000 <.text>:
>       0:	2d 21 f1 ff 00 00 00 00 	if r1>r2 goto -15
> 
> GAS emits an error instead:
> 
>    $ as -mdialect=pseudoc foo.s
>    foo.s: Assembler messages:
>    foo.s:1: Error: pc-relative offset out of range, shall fit in 16 bits.
> 
> (The same happens with 32-bit immediates.)
> 
> We think the error is pertinent, and we recommend the llvm assembler to
> behave the same way.

Thanks! We will take a look at this issue soon.


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-17 16:23           ` Yonghong Song
@ 2023-08-17 17:14             ` Yonghong Song
  2023-08-17 17:37               ` Jose E. Marchesi
  0 siblings, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2023-08-17 17:14 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: bpf, david.faust, cupertino.miranda



On 8/17/23 9:23 AM, Yonghong Song wrote:
> 
> 
> On 8/17/23 1:01 AM, Jose E. Marchesi wrote:
>>
>>> [...]
>>> In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
>>> 4-byte bit-wise encoding, so they will be all encoded the same
>>> 0xfffffffe in the actual insn.
>>>
>>> The following is an example for x86 target in llvm:
>>>
>>> $ cat t.c
>>> int foo() {
>>>    int a, b;
>>>
>>>    asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
>>>    asm volatile("movl $-2, %0" : "=r"(b) :);
>>>    return a + b;
>>> }
>>> $ clang -O2 -c t.c
>>> $ llvm-objdump -d t.o
>>>
>>> t.o:    file format elf64-x86-64
>>>
>>> Disassembly of section .text:
>>>
>>> 0000000000000000 <foo>:
>>>         0: b9 fe ff ff ff                movl    $0xfffffffe, %ecx #
>>>        imm = 0xFFFFFFFE
>>>         5: b8 fe ff ff ff                movl    $0xfffffffe, %eax #
>>>        imm = 0xFFFFFFFE
>>>         a: 01 c8                         addl    %ecx, %eax
>>>         c: c3                            retq
>>> $
>>>
>>> Whether it is 0xfffffffe or -2, the insn encoding is the same
>>> and disasm prints out 0xfffffffe.
>>
>> Thanks for the explanation.
>>
>> I have pushed the commit below to binutils that makes GAS match the llvm
>> assembler behavior regarding constant immediates.  With this patch there
>> are no more assembler errors when building the kernel bpf selftests.
> 
> Great! Thanks.
> 
>>
>> Note however that there is one pending divergence in the behavior of
>> both assemblers when facing invalid programs where immediate operands
>> cannot be represented in the number of bits of the field like in:
>>
>>    $ cat foo.s
>>    if r1 > r2 goto 0x3fff1
>>
>> llvm silently truncates it to 16-bit:
>>
>>    $ clang -target bpf foo.s
>>    $ bpf-unknown-none-objdump -M pseudoc -dr foo.o
>>    0000000000000000 <.text>:
>>       0:    2d 21 f1 ff 00 00 00 00     if r1>r2 goto -15
>>
>> GAS emits an error instead:
>>
>>    $ as -mdialect=pseudoc foo.s
>>    foo.s: Assembler messages:
>>    foo.s:1: Error: pc-relative offset out of range, shall fit in 16 bits.
>>
>> (The same happens with 32-bit immediates.)
>>
>> We think the error is pertinent, and we recommend the llvm assembler to
>> behave the same way.
> 
> Thanks! We will take a look at this issue soon.

A patch like the one below can report an error for the above case:

diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
index 420a2aad480a..fca6bf30fb4b 100644
--- a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
+++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
@@ -136,6 +136,12 @@ void BPFMCCodeEmitter::encodeInstruction(const MCInst &MI,
      OSE.write<uint16_t>(0);
      OSE.write<uint32_t>(Imm >> 32);
    } else {
+    if (Opcode == BPF::JUGT_rr) {
+      const MCOperand &MO = MI.getOperand(2);
+      int64_t Imm = MO.isImm() ? MO.getImm() : 0;
+      if (Imm > INT16_MAX || Imm < INT16_MIN)
+        report_fatal_error("Branch target out of insn range");
+    }
      // Get instruction encoding and emit it
      uint64_t Value = getBinaryCodeForInstr(MI, Fixups, STI);
      CB.push_back(Value >> 56);

This needs to be generalized to the other conditional/unconditional
jump instructions. Will have a formal patch for llvm soon.
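
One way to think about the generalization, sketched in C rather than in
LLVM terms, is to key the check on the opcode class instead of a single
opcode.  The constants below are the usual BPF opcode field values; the
two function names are assumptions for the example, and a real patch
would work on MCInst operands instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_OP(code)    ((code) & 0xf0)
#define BPF_JMP   0x05
#define BPF_JMP32 0x06
#define BPF_JA    0x00
#define BPF_CALL  0x80
#define BPF_EXIT  0x90

/* Does this opcode take its branch target from the 16-bit offset?  */
static bool branches_via_off16(uint8_t opcode)
{
    uint8_t cls = BPF_CLASS(opcode);
    uint8_t op = BPF_OP(opcode);

    if (cls == BPF_JMP)
        return op != BPF_CALL && op != BPF_EXIT;
    if (cls == BPF_JMP32)
        return op != BPF_JA;    /* JMP32 | JA takes its target from imm */
    return false;
}

/* Accept a constant target unless it must fit in 16 bits and does not.  */
static bool branch_target_ok(uint8_t opcode, int64_t target)
{
    if (!branches_via_off16(opcode))
        return true;
    return target >= INT16_MIN && target <= INT16_MAX;
}
```

For opcode 0x2d (the "if r1 > r2 goto" case above) a target of 0x3fff1
is rejected and -15 is accepted; for 0x85 (call) no 16-bit check applies.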

Thanks.


* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-17 17:14             ` Yonghong Song
@ 2023-08-17 17:37               ` Jose E. Marchesi
  2023-08-17 17:44                 ` Yonghong Song
  0 siblings, 1 reply; 12+ messages in thread
From: Jose E. Marchesi @ 2023-08-17 17:37 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, david.faust, cupertino.miranda


> On 8/17/23 9:23 AM, Yonghong Song wrote:
>> On 8/17/23 1:01 AM, Jose E. Marchesi wrote:
>>>
>>>> [...]
>>>> In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
>>>> 4-byte bit-wise encoding, so they will be all encoded the same
>>>> 0xfffffffe in the actual insn.
>>>>
>>>> The following is an example for x86 target in llvm:
>>>>
>>>> $ cat t.c
>>>> int foo() {
>>>>    int a, b;
>>>>
>>>>    asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
>>>>    asm volatile("movl $-2, %0" : "=r"(b) :);
>>>>    return a + b;
>>>> }
>>>> $ clang -O2 -c t.c
>>>> $ llvm-objdump -d t.o
>>>>
>>>> t.o:    file format elf64-x86-64
>>>>
>>>> Disassembly of section .text:
>>>>
>>>> 0000000000000000 <foo>:
>>>>         0: b9 fe ff ff ff                movl    $0xfffffffe, %ecx #
>>>>        imm = 0xFFFFFFFE
>>>>         5: b8 fe ff ff ff                movl    $0xfffffffe, %eax #
>>>>        imm = 0xFFFFFFFE
>>>>         a: 01 c8                         addl    %ecx, %eax
>>>>         c: c3                            retq
>>>> $
>>>>
>>>> Whether it is 0xfffffffe or -2, the insn encoding is the same
>>>> and disasm prints out 0xfffffffe.
>>>
>>> Thanks for the explanation.
>>>
>>> I have pushed the commit below to binutils that makes GAS match the llvm
>>> assembler behavior regarding constant immediates.  With this patch there
>>> are no more assembler errors when building the kernel bpf selftests.
>> Great! Thanks.
>> 
>>>
>>> Note however that there is one pending divergence in the behavior of
>>> both assemblers when facing invalid programs where immediate operands
>>> cannot be represented in the number of bits of the field like in:
>>>
>>>    $ cat foo.s
>>>    if r1 > r2 goto 0x3fff1
>>>
>>> llvm silently truncates it to 16-bit:
>>>
>>>    $ clang -target bpf foo.s
>>>    $ bpf-unknown-none-objdump -M pseudoc -dr foo.o
>>>    0000000000000000 <.text>:
>>>       0:    2d 21 f1 ff 00 00 00 00     if r1>r2 goto -15
>>>
>>> GAS emits an error instead:
>>>
>>>    $ as -mdialect=pseudoc foo.s
>>>    foo.s: Assembler messages:
>>>    foo.s:1: Error: pc-relative offset out of range, shall fit in 16 bits.
>>>
>>> (The same happens with 32-bit immediates.)
>>>
>>> We think the error is pertinent, and we recommend the llvm assembler to
>>> behave the same way.
>> Thanks! We will take a look at this issue soon.
>
> A patch like below can issue the warning for the above case:
>
> diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
> b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
> index 420a2aad480a..fca6bf30fb4b 100644
> --- a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
> +++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
> @@ -136,6 +136,12 @@ void BPFMCCodeEmitter::encodeInstruction(const
> MCInst &MI,
>      OSE.write<uint16_t>(0);
>      OSE.write<uint32_t>(Imm >> 32);
>    } else {
> +    if (Opcode == BPF::JUGT_rr) {
> +      const MCOperand &MO = MI.getOperand(2);
> +      int64_t Imm = MO.isImm() ? MO.getImm() : 0;
> +      if (Imm > INT16_MAX || Imm < INT16_MIN)

Shouldn't that be:

  if (Imm > UINT16_MAX || Imm < INT16_MIN)

?

> +        report_fatal_error("Branch target out of insn range");
> +    }
>      // Get instruction encoding and emit it
>      uint64_t Value = getBinaryCodeForInstr(MI, Fixups, STI);
>      CB.push_back(Value >> 56);
>
> Need to generalize to other related conditional/unconditional
> operands. Will have a formal patch for llvm soon.
>
> Thanks.

* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-17 17:37               ` Jose E. Marchesi
@ 2023-08-17 17:44                 ` Yonghong Song
  2023-08-17 18:06                   ` Jose E. Marchesi
  0 siblings, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2023-08-17 17:44 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: bpf, david.faust, cupertino.miranda



On 8/17/23 10:37 AM, Jose E. Marchesi wrote:
> 
>> On 8/17/23 9:23 AM, Yonghong Song wrote:
>>> On 8/17/23 1:01 AM, Jose E. Marchesi wrote:
>>>>
>>>>> [...]
>>>>> In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
>>>>> 4-byte bit-wise encoding, so they will be all encoded the same
>>>>> 0xfffffffe in the actual insn.
>>>>>
>>>>> The following is an example for x86 target in llvm:
>>>>>
>>>>> $ cat t.c
>>>>> int foo() {
>>>>>     int a, b;
>>>>>
>>>>>     asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
>>>>>     asm volatile("movl $-2, %0" : "=r"(b) :);
>>>>>     return a + b;
>>>>> }
>>>>> $ clang -O2 -c t.c
>>>>> $ llvm-objdump -d t.o
>>>>>
>>>>> t.o:    file format elf64-x86-64
>>>>>
>>>>> Disassembly of section .text:
>>>>>
>>>>> 0000000000000000 <foo>:
>>>>>          0: b9 fe ff ff ff                movl    $0xfffffffe, %ecx #
>>>>>         imm = 0xFFFFFFFE
>>>>>          5: b8 fe ff ff ff                movl    $0xfffffffe, %eax #
>>>>>         imm = 0xFFFFFFFE
>>>>>          a: 01 c8                         addl    %ecx, %eax
>>>>>          c: c3                            retq
>>>>> $
>>>>>
>>>>> Whether it is 0xfffffffe or -2, the insn encoding is the same
>>>>> and disasm prints out 0xfffffffe.
>>>>
>>>> Thanks for the explanation.
>>>>
>>>> I have pushed the commit below to binutils that makes GAS match the llvm
>>>> assembler behavior regarding constant immediates.  With this patch there
>>>> are no more assembler errors when building the kernel bpf selftests.
>>> Great! Thanks.
>>>
>>>>
>>>> Note however that there is one pending divergence in the behavior of
>>>> both assemblers when facing invalid programs where immediate operands
>>>> cannot be represented in the number of bits of the field like in:
>>>>
>>>>     $ cat foo.s
>>>>     if r1 > r2 goto 0x3fff1
>>>>
>>>> llvm silently truncates it to 16-bit:
>>>>
>>>>     $ clang -target bpf foo.s
>>>>     $ bpf-unknown-none-objdump -M pseudoc -dr foo.o
>>>>     0000000000000000 <.text>:
>>>>        0:    2d 21 f1 ff 00 00 00 00     if r1>r2 goto -15
>>>>
>>>> GAS emits an error instead:
>>>>
>>>>     $ as -mdialect=pseudoc foo.s
>>>>     foo.s: Assembler messages:
>>>>     foo.s:1: Error: pc-relative offset out of range, shall fit in 16 bits.
>>>>
>>>> (The same happens with 32-bit immediates.)
>>>>
>>>> We think the error is pertinent, and we recommend the llvm assembler to
>>>> behave the same way.
>>> Thanks! We will take a look at this issue soon.
>>
>> A patch like below can issue the warning for the above case:
>>
>> diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>> b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>> index 420a2aad480a..fca6bf30fb4b 100644
>> --- a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>> +++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>> @@ -136,6 +136,12 @@ void BPFMCCodeEmitter::encodeInstruction(const
>> MCInst &MI,
>>       OSE.write<uint16_t>(0);
>>       OSE.write<uint32_t>(Imm >> 32);
>>     } else {
>> +    if (Opcode == BPF::JUGT_rr) {
>> +      const MCOperand &MO = MI.getOperand(2);
>> +      int64_t Imm = MO.isImm() ? MO.getImm() : 0;
>> +      if (Imm > INT16_MAX || Imm < INT16_MIN)
> 
> Shouldn't that be:
> 
>    if (Imm > UINT16_MAX || Imm < INT16_MIN)

The number 'Imm' represents the true offset (positive or negative)
as written in the .s file.
So a positive offset of 0xfffffffe cannot be represented.
The encoding 0xfffffffe in the insn actually means -2.
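
To make the distinction between the written offset and the encoded field
concrete, here is a small standalone sketch for the 16-bit offset field.
The function names are made up for illustration and do not exist in LLVM.
It mimics a silently truncating encoder and shows how the stored bits read
back as a signed value.

```cpp
#include <cstdint>

// Hypothetical: keep only the low 16 bits, as a truncating encoder would.
constexpr uint16_t encodeOff16Truncating(int64_t Offset) {
  return static_cast<uint16_t>(static_cast<uint64_t>(Offset) & 0xffff);
}

// Hypothetical: interpret the raw 16-bit field as a signed branch offset.
constexpr int32_t decodeOff16(uint16_t Raw) {
  return Raw >= 0x8000 ? static_cast<int32_t>(Raw) - 0x10000
                       : static_cast<int32_t>(Raw);
}

// 0x3fff1 loses its upper bits and comes back as -15 ("goto -15").
static_assert(encodeOff16Truncating(0x3fff1) == 0xfff1, "low 16 bits kept");
static_assert(decodeOff16(0xfff1) == -15, "read back as a signed offset");

// In a 16-bit field the raw value 0xfffe means -2, not +65534.
static_assert(decodeOff16(0xfffe) == -2, "0xfffe encodes -2");
```

This is why the check compares the written value against INT16_MAX rather
than UINT16_MAX: a positive offset above 32767 has no encoding of its own.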

> 
> ?
> 
>> +        report_fatal_error("Branch target out of insn range");
>> +    }
>>       // Get instruction encoding and emit it
>>       uint64_t Value = getBinaryCodeForInstr(MI, Fixups, STI);
>>       CB.push_back(Value >> 56);
>>
>> Need to generalize to other related conditional/unconditional
>> operands. Will have a formal patch for llvm soon.
>>
>> Thanks.
> 

* Re: Masks and overflow of signed immediates in BPF instructions
  2023-08-17 17:44                 ` Yonghong Song
@ 2023-08-17 18:06                   ` Jose E. Marchesi
  0 siblings, 0 replies; 12+ messages in thread
From: Jose E. Marchesi @ 2023-08-17 18:06 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, david.faust, cupertino.miranda


> On 8/17/23 10:37 AM, Jose E. Marchesi wrote:
>> 
>>> On 8/17/23 9:23 AM, Yonghong Song wrote:
>>>> On 8/17/23 1:01 AM, Jose E. Marchesi wrote:
>>>>>
>>>>>> [...]
>>>>>> In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
>>>>>> 4-byte bit-wise encoding, so they will be all encoded the same
>>>>>> 0xfffffffe in the actual insn.
>>>>>>
>>>>>> The following is an example for x86 target in llvm:
>>>>>>
>>>>>> $ cat t.c
>>>>>> int foo() {
>>>>>>     int a, b;
>>>>>>
>>>>>>     asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
>>>>>>     asm volatile("movl $-2, %0" : "=r"(b) :);
>>>>>>     return a + b;
>>>>>> }
>>>>>> $ clang -O2 -c t.c
>>>>>> $ llvm-objdump -d t.o
>>>>>>
>>>>>> t.o:    file format elf64-x86-64
>>>>>>
>>>>>> Disassembly of section .text:
>>>>>>
>>>>>> 0000000000000000 <foo>:
>>>>>>          0: b9 fe ff ff ff                movl    $0xfffffffe, %ecx #
>>>>>>         imm = 0xFFFFFFFE
>>>>>>          5: b8 fe ff ff ff                movl    $0xfffffffe, %eax #
>>>>>>         imm = 0xFFFFFFFE
>>>>>>          a: 01 c8                         addl    %ecx, %eax
>>>>>>          c: c3                            retq
>>>>>> $
>>>>>>
>>>>>> Whether it is 0xfffffffe or -2, the insn encoding is the same
>>>>>> and disasm prints out 0xfffffffe.
>>>>>
>>>>> Thanks for the explanation.
>>>>>
>>>>> I have pushed the commit below to binutils that makes GAS match the llvm
>>>>> assembler behavior regarding constant immediates.  With this patch there
>>>>> are no more assembler errors when building the kernel bpf selftests.
>>>> Great! Thanks.
>>>>
>>>>>
>>>>> Note however that there is one pending divergence in the behavior of
>>>>> both assemblers when facing invalid programs where immediate operands
>>>>> cannot be represented in the number of bits of the field like in:
>>>>>
>>>>>     $ cat foo.s
>>>>>     if r1 > r2 goto 0x3fff1
>>>>>
>>>>> llvm silently truncates it to 16-bit:
>>>>>
>>>>>     $ clang -target bpf foo.s
>>>>>     $ bpf-unknown-none-objdump -M pseudoc -dr foo.o
>>>>>     0000000000000000 <.text>:
>>>>>        0:    2d 21 f1 ff 00 00 00 00     if r1>r2 goto -15
>>>>>
>>>>> GAS emits an error instead:
>>>>>
>>>>>     $ as -mdialect=pseudoc foo.s
>>>>>     foo.s: Assembler messages:
>>>>>     foo.s:1: Error: pc-relative offset out of range, shall fit in 16 bits.
>>>>>
>>>>> (The same happens with 32-bit immediates.)
>>>>>
>>>>> We think the error is pertinent, and we recommend the llvm assembler to
>>>>> behave the same way.
>>>> Thanks! We will take a look at this issue soon.
>>>
>>> A patch like below can issue the warning for the above case:
>>>
>>> diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>>> b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>>> index 420a2aad480a..fca6bf30fb4b 100644
>>> --- a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>>> +++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>>> @@ -136,6 +136,12 @@ void BPFMCCodeEmitter::encodeInstruction(const
>>> MCInst &MI,
>>>       OSE.write<uint16_t>(0);
>>>       OSE.write<uint32_t>(Imm >> 32);
>>>     } else {
>>> +    if (Opcode == BPF::JUGT_rr) {
>>> +      const MCOperand &MO = MI.getOperand(2);
>>> +      int64_t Imm = MO.isImm() ? MO.getImm() : 0;
>>> +      if (Imm > INT16_MAX || Imm < INT16_MIN)
>> Shouldn't that be:
>>    if (Imm > UINT16_MAX || Imm < INT16_MIN)
>
> The number 'Imm' represents true offset (positive or negative)
> as represented in .s file.
> So positive offset 0xfffffffe cannot be represented.
> The encoding in insn with 0xfffffffe actually means -2.

Oh ok, so that's the value already encoded :)

>> ?
>> 
>>> +        report_fatal_error("Branch target out of insn range");
>>> +    }
>>>       // Get instruction encoding and emit it
>>>       uint64_t Value = getBinaryCodeForInstr(MI, Fixups, STI);
>>>       CB.push_back(Value >> 56);
>>>
>>> Need to generalize to other related conditional/unconditional
>>> operands. Will have a formal patch for llvm soon.
>>>
>>> Thanks.
>> 

Thread overview: 12 messages
2023-08-15 14:19 Masks and overflow of signed immediates in BPF instructions Jose E. Marchesi
2023-08-15 16:12 ` Yonghong Song
2023-08-15 17:01   ` Jose E. Marchesi
2023-08-15 17:28     ` Yonghong Song
2023-08-16  9:36     ` Jose E. Marchesi
2023-08-16 16:22       ` Yonghong Song
2023-08-17  8:01         ` Jose E. Marchesi
2023-08-17 16:23           ` Yonghong Song
2023-08-17 17:14             ` Yonghong Song
2023-08-17 17:37               ` Jose E. Marchesi
2023-08-17 17:44                 ` Yonghong Song
2023-08-17 18:06                   ` Jose E. Marchesi
