From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Yonghong Song <yonghong.song@linux.dev>
Cc: bpf@vger.kernel.org, david.faust@oracle.com,
cupertino.miranda@oracle.com
Subject: Re: Masks and overflow of signed immediates in BPF instructions
Date: Thu, 17 Aug 2023 20:06:05 +0200 [thread overview]
Message-ID: <87edk1wnle.fsf@oracle.com> (raw)
In-Reply-To: <73c939b9-a06b-ee02-f260-39fade8ac1b6@linux.dev> (Yonghong Song's message of "Thu, 17 Aug 2023 10:44:24 -0700")
> On 8/17/23 10:37 AM, Jose E. Marchesi wrote:
>>
>>> On 8/17/23 9:23 AM, Yonghong Song wrote:
>>>> On 8/17/23 1:01 AM, Jose E. Marchesi wrote:
>>>>>
>>>>>> [...]
>>>>>> In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same
>>>>>> 4-byte bit-wise encoding, so they will be all encoded the same
>>>>>> 0xfffffffe in the actual insn.
>>>>>>
>>>>>> The following is an example for x86 target in llvm:
>>>>>>
>>>>>> $ cat t.c
>>>>>> int foo() {
>>>>>> int a, b;
>>>>>>
>>>>>> asm volatile("movl $0xfffffffe, %0" : "=r"(a) :);
>>>>>> asm volatile("movl $-2, %0" : "=r"(b) :);
>>>>>> return a + b;
>>>>>> }
>>>>>> $ clang -O2 -c t.c
>>>>>> $ llvm-objdump -d t.o
>>>>>>
>>>>>> t.o: file format elf64-x86-64
>>>>>>
>>>>>> Disassembly of section .text:
>>>>>>
>>>>>> 0000000000000000 <foo>:
>>>>>> 0: b9 fe ff ff ff movl $0xfffffffe, %ecx #
>>>>>> imm = 0xFFFFFFFE
>>>>>> 5: b8 fe ff ff ff movl $0xfffffffe, %eax #
>>>>>> imm = 0xFFFFFFFE
>>>>>> a: 01 c8 addl %ecx, %eax
>>>>>> c: c3 retq
>>>>>> $
>>>>>>
>>>>>> Whether it is 0xfffffffe or -2, the insn encoding is the same
>>>>>> and disasm prints out 0xfffffffe.
>>>>>
>>>>> Thanks for the explanation.
>>>>>
>>>>> I have pushed the commit below to binutils that makes GAS match the llvm
>>>>> assembler behavior regarding constant immediates. With this patch there
>>>>> are no more assembler errors when building the kernel bpf selftests.
>>>> Great! Thanks.
>>>>
>>>>>
>>>>> Note however that there is one pending divergence in the behavior of
>>>>> both assemblers when facing invalid programs where immediate operands
>>>>> cannot be represented in the number of bits of the field like in:
>>>>>
>>>>> $ cat foo.s
>>>>> if r1 > r2 goto 0x3fff1
>>>>>
>>>>> llvm silently truncates it to 16-bit:
>>>>>
>>>>> $ clang -target bpf foo.s
>>>>> $ bpf-unkonwn-none-objdump -M pseudoc -dr foo.o
>>>>> 0000000000000000 <.text>:
>>>>> 0: 2d 21 f1 ff 00 00 00 00 if r1>r2 goto -15
>>>>>
>>>>> GAS emits an error instead:
>>>>>
>>>>> $ as -mdialect=pseudoc foo.s
>>>>> foo.s: Assembler messages:
>>>>> foo.s:1: Error: pc-relative offset out of range, shall fit in 16 bits.
>>>>>
>>>>> (The same happens with 32-bit immediates.)
>>>>>
>>>>> We think the error is pertinent, and we recommend the llvm assembler to
>>>>> behave the same way.
>>>> Thanks! We will take a look at this issue soon.
>>>
>>> A patch like below can issue the warning for the above case:
>>>
>>> diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>>> b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>>> index 420a2aad480a..fca6bf30fb4b 100644
>>> --- a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>>> +++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
>>> @@ -136,6 +136,12 @@ void BPFMCCodeEmitter::encodeInstruction(const
>>> MCInst &MI,
>>> OSE.write<uint16_t>(0);
>>> OSE.write<uint32_t>(Imm >> 32);
>>> } else {
>>> + if (Opcode == BPF::JUGT_rr) {
>>> + const MCOperand &MO = MI.getOperand(2);
>>> + int64_t Imm = MO.isImm() ? MO.getImm() : 0;
>>> + if (Imm > INT16_MAX || Imm < INT16_MIN)
>> Shouldn't that be:
>> if (Imm > UINT16_MAX || Imm < INT16_MIN)
>
> The number 'Imm' represents true offset (positive or negative)
> as represented in .s file.
> So positive offset 0xfffffffe cannot be presented.
> The encoding in insn with 0xfffffffe actually means -2.
Oh ok, so thats the value already encoded :)
>> ?
>>
>>> + report_fatal_error("Branch target out of insn range");
>>> + }
>>> // Get instruction encoding and emit it
>>> uint64_t Value = getBinaryCodeForInstr(MI, Fixups, STI);
>>> CB.push_back(Value >> 56);
>>>
>>> Need to generalize to other related conditional/unconditional
>>> operands. Will have a formal patch for llvm soon.
>>>
>>> Thanks.
>>
prev parent reply other threads:[~2023-08-17 18:06 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-15 14:19 Masks and overflow of signed immediates in BPF instructions Jose E. Marchesi
2023-08-15 16:12 ` Yonghong Song
2023-08-15 17:01 ` Jose E. Marchesi
2023-08-15 17:28 ` Yonghong Song
2023-08-16 9:36 ` Jose E. Marchesi
2023-08-16 16:22 ` Yonghong Song
2023-08-17 8:01 ` Jose E. Marchesi
2023-08-17 16:23 ` Yonghong Song
2023-08-17 17:14 ` Yonghong Song
2023-08-17 17:37 ` Jose E. Marchesi
2023-08-17 17:44 ` Yonghong Song
2023-08-17 18:06 ` Jose E. Marchesi [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87edk1wnle.fsf@oracle.com \
--to=jose.marchesi@oracle.com \
--cc=bpf@vger.kernel.org \
--cc=cupertino.miranda@oracle.com \
--cc=david.faust@oracle.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.