Re: Masks and overflow of signed immediates in BPF instructions

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Yonghong Song <yonghong.song@linux.dev>
To: "Jose E. Marchesi" <jose.marchesi@oracle.com>
Cc: bpf@vger.kernel.org, david.faust@oracle.com,
	cupertino.miranda@oracle.com
Subject: Re: Masks and overflow of signed immediates in BPF instructions
Date: Wed, 16 Aug 2023 09:22:09 -0700	[thread overview]
Message-ID: <bbd86b4e-89ea-8e60-883e-f348117483b4@linux.dev> (raw)
In-Reply-To: <87wmxv2ut4.fsf@oracle.com>



On 8/16/23 2:36 AM, Jose E. Marchesi wrote:
> 
>>> On 8/15/23 7:19 AM, Jose E. Marchesi wrote:
>>>> Hello.
>>>> The selftest progs/verifier_masking.c contains inline assembly code
>>>> like:
>>>>     	w1 = 0xffffffff;
>>>> The 32-bit immediate of that instruction is signed.  Therefore, GAS
>>>> complains that the above instruction overflows its field:
>>>>     /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall
>>>> fit in 32 bits
>>>> The llvm assembler is likely relying on signed overflow for the
>>>> above to
>>>> work.
>>>
>>> Not really.
>>>
>>>    def _ri_32 : ALU_RI<BPF_ALU, Opc, off,
>>>                     (outs GPR32:$dst),
>>>                     (ins GPR32:$src2, i32imm:$imm),
>>>                     "$dst "#OpcodeStr#" $imm",
>>>                     [(set GPR32:$dst, (OpNode GPR32:$src2,
>>>                     i32immSExt32:$imm))]>;
>>>
>>>
>>> If generating from source, the pattern [(set GPR32:$dst, (OpNode
>>> GPR32:$src2, i32immSExt32:$imm))] so value 0xffffffff is not SExt32
>>> and it won't match and eventually a LDimm_64 insn will be generated.
>>
>> If by "generating from source" you mean compiling from C, then sure, I
>> wasn't implying clang was generating `r1 = 0xffffffff' for assigning
>> that positive value to a register.
>>
>>> But for inline asm, we will have
>>>    (outs GPR32:$dst)
>>>    (ins GPR32:$src2, i32imm:$imm)
>>>
>>> and i32imm is defined as
>>>    def i32imm : Operand<i32>;
>>> which is a unsigned 32bit value, so it is recognized properly
>>> and the insn is encoded properly.
>>
>> We thought the imm32 operand in ALU instructions is signed, not
>> unsigned.  Is it really unsigned??
> 
> I am going through all the BPF instructions that get 32-bit, 16-bit and
> 64-bit immediates, because it seems to me that we may need to
> distinguish between two different levels:
> 
> - Value encoded in the instruction immediate: interpreted as signed or
>    as unsigned.

The 'imm' in the instruction is a 32-bit signed insn.
I think we have no dispute here.

> 
> - How the assembler interprets a written number for the corresponding
>    instruction operand: for example, for which instructions the assemler
>    shall accept 0xfffffffe and 4294967294 and -2 all to denote the same
>    value, what value is it (negative or positive) or shall it emit an
>    overflow error.

In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
4-byte bit-wise encoding, so they will be all encoded the same
0xfffffffe in the actual insn.

The following is an example for x86 target in llvm:

$ cat t.c
int foo() {
   int a, b;

   asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
   asm volatile("movl $-2, %0" : "=r"(b) :);
   return a + b;
}
$ clang -O2 -c t.c
$ llvm-objdump -d t.o

t.o:    file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <foo>:
        0: b9 fe ff ff ff                movl    $0xfffffffe, %ecx 
# imm = 0xFFFFFFFE
        5: b8 fe ff ff ff                movl    $0xfffffffe, %eax 
# imm = 0xFFFFFFFE
        a: 01 c8                         addl    %ecx, %eax
        c: c3                            retq
$

Whether it is 0xfffffffe or -2, the insn encoding is the same
and disasm prints out 0xfffffffe.

> 
> Will follow up with a summary that hopefully will serve to clarify this.
> 
>>>> Using negative numbers to denote masks is ugly and obfuscating (for
>>>> non-obvious cases like -1/0xffffffff) so I suggest we introduce a
>>>> pseudo-op so we can do:
>>>>      w1 = %mask(0xffffffff)
>>>
>>> I changed above
>>>    w1 = 0xffffffff;
>>> to
>>>    w1 = %mask(0xffffffff)
>>> and hit the following compilation failure.
>>>
>>> progs/verifier_masking.c:54:9: error: invalid % escape in inline
>>> assembly string
>>>     53 |         asm volatile ("                                 \
>>>        |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>     54 |         w1 = %mask(0xffffffff);                         \
>>>        |                ^
>>> 1 error generated.
>>>
>>> Do you have documentation what is '%mask' thing?
>>
>> It doesn't exist.
>>
>> I am suggesting to add support for that pseudo-op to the BPF assemblers:
>> both GAS and the llvm BPF assembler.
>>
>>>> allowing the assembler to do the right thing (TM) converting and
>>>> checking that the mask is valid and not relying on UB.
>>>> Thoughts?
>>>>

next prev parent reply	other threads:[~2023-08-16 16:22 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-15 14:19 Masks and overflow of signed immediates in BPF instructions Jose E. Marchesi
2023-08-15 16:12 ` Yonghong Song
2023-08-15 17:01   ` Jose E. Marchesi
2023-08-15 17:28     ` Yonghong Song
2023-08-16  9:36     ` Jose E. Marchesi
2023-08-16 16:22       ` Yonghong Song [this message]
2023-08-17  8:01         ` Jose E. Marchesi
2023-08-17 16:23           ` Yonghong Song
2023-08-17 17:14             ` Yonghong Song
2023-08-17 17:37               ` Jose E. Marchesi
2023-08-17 17:44                 ` Yonghong Song
2023-08-17 18:06                   ` Jose E. Marchesi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bbd86b4e-89ea-8e60-883e-f348117483b4@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=bpf@vger.kernel.org \
    --cc=cupertino.miranda@oracle.com \
    --cc=david.faust@oracle.com \
    --cc=jose.marchesi@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.