Re: Masks and overflow of signed immediates in BPF instructions

public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed

From: Yonghong Song <yonghong.song@linux.dev>
To: "Jose E. Marchesi" <jose.marchesi@oracle.com>
Cc: bpf@vger.kernel.org, david.faust@oracle.com,
	cupertino.miranda@oracle.com
Subject: Re: Masks and overflow of signed immediates in BPF instructions
Date: Tue, 15 Aug 2023 10:28:58 -0700	[thread overview]
Message-ID: <13d32d42-0924-5533-8407-e4cbc9ee117f@linux.dev> (raw)
In-Reply-To: <87leec44v1.fsf@oracle.com>



On 8/15/23 10:01 AM, Jose E. Marchesi wrote:
> 
>> On 8/15/23 7:19 AM, Jose E. Marchesi wrote:
>>> Hello.
>>> The selftest progs/verifier_masking.c contains inline assembly code
>>> like:
>>>     	w1 = 0xffffffff;
>>> The 32-bit immediate of that instruction is signed.  Therefore, GAS
>>> complains that the above instruction overflows its field:
>>>     /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall
>>> fit in 32 bits
>>> The llvm assembler is likely relying on signed overflow for the
>>> above to
>>> work.
>>
>> Not really.
>>
>>    def _ri_32 : ALU_RI<BPF_ALU, Opc, off,
>>                     (outs GPR32:$dst),
>>                     (ins GPR32:$src2, i32imm:$imm),
>>                     "$dst "#OpcodeStr#" $imm",
>>                     [(set GPR32:$dst, (OpNode GPR32:$src2,
>>                     i32immSExt32:$imm))]>;
>>
>>
>> If generating from source, the pattern [(set GPR32:$dst, (OpNode
>> GPR32:$src2, i32immSExt32:$imm))] so value 0xffffffff is not SExt32
>> and it won't match and eventually a LDimm_64 insn will be generated.
> 
> If by "generating from source" you mean compiling from C, then sure, I
> wasn't implying clang was generating `r1 = 0xffffffff' for assigning
> that positive value to a register.
> 
>> But for inline asm, we will have
>>    (outs GPR32:$dst)
>>    (ins GPR32:$src2, i32imm:$imm)
>>
>> and i32imm is defined as
>>    def i32imm : Operand<i32>;
>> which is a unsigned 32bit value, so it is recognized properly
>> and the insn is encoded properly.
> 
> We thought the imm32 operand in ALU instructions is signed, not
> unsigned.  Is it really unsigned??

The 'i32' in LLVM just represents a 4-byte value, there is no
signed-ness attached, which I interpret it as unsigned.
See below example,

$ cat t.c
int a;
unsigned b;
long c;
long add1() { return a + c; }
long add2() { return b + c; }
$ clang --target=bpf -O2 -S -emit-llvm t.c
$ cat t.ll
; ModuleID = 't.c'
source_filename = "t.c"
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "bpf"

@a = dso_local local_unnamed_addr global i32 0, align 4
@c = dso_local local_unnamed_addr global i64 0, align 8
@b = dso_local local_unnamed_addr global i32 0, align 4

; Function Attrs: mustprogress nofree norecurse nosync nounwind 
willreturn memory(read, argmem: none, inaccessiblemem: none)
define dso_local i64 @add1() local_unnamed_addr #0 {
entry:
   %0 = load i32, ptr @a, align 4, !tbaa !3
   %conv = sext i32 %0 to i64
   %1 = load i64, ptr @c, align 8, !tbaa !7
   %add = add nsw i64 %1, %conv
   ret i64 %add
}

; Function Attrs: mustprogress nofree norecurse nosync nounwind 
willreturn memory(read, argmem: none, inaccessiblemem: none)
define dso_local i64 @add2() local_unnamed_addr #0 {
entry:
   %0 = load i32, ptr @b, align 4, !tbaa !3
   %conv = zext i32 %0 to i64
   %1 = load i64, ptr @c, align 8, !tbaa !7
   %add = add nsw i64 %1, %conv
   ret i64 %add
}

You can see global variables 'a', 'b' and 'c' are defined
as 'i32' and 'i64' respectively. The signed/unsigned-ness is
used during IR code generation.

> 
>>> Using negative numbers to denote masks is ugly and obfuscating (for
>>> non-obvious cases like -1/0xffffffff) so I suggest we introduce a
>>> pseudo-op so we can do:
>>>      w1 = %mask(0xffffffff)
>>
>> I changed above
>>    w1 = 0xffffffff;
>> to
>>    w1 = %mask(0xffffffff)
>> and hit the following compilation failure.
>>
>> progs/verifier_masking.c:54:9: error: invalid % escape in inline
>> assembly string
>>     53 |         asm volatile ("                                 \
>>        |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>     54 |         w1 = %mask(0xffffffff);                         \
>>        |                ^
>> 1 error generated.
>>
>> Do you have documentation what is '%mask' thing?
> 
> It doesn't exist.
> 
> I am suggesting to add support for that pseudo-op to the BPF assemblers:
> both GAS and the llvm BPF assembler.
> 
>>> allowing the assembler to do the right thing (TM) converting and
>>> checking that the mask is valid and not relying on UB.
>>> Thoughts?
>>>

next prev parent reply	other threads:[~2023-08-15 17:29 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-15 14:19 Masks and overflow of signed immediates in BPF instructions Jose E. Marchesi
2023-08-15 16:12 ` Yonghong Song
2023-08-15 17:01   ` Jose E. Marchesi
2023-08-15 17:28     ` Yonghong Song [this message]
2023-08-16  9:36     ` Jose E. Marchesi
2023-08-16 16:22       ` Yonghong Song
2023-08-17  8:01         ` Jose E. Marchesi
2023-08-17 16:23           ` Yonghong Song
2023-08-17 17:14             ` Yonghong Song
2023-08-17 17:37               ` Jose E. Marchesi
2023-08-17 17:44                 ` Yonghong Song
2023-08-17 18:06                   ` Jose E. Marchesi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=13d32d42-0924-5533-8407-e4cbc9ee117f@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=bpf@vger.kernel.org \
    --cc=cupertino.miranda@oracle.com \
    --cc=david.faust@oracle.com \
    --cc=jose.marchesi@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox