Re: Register encoding in assembly for load/store instructions

public inbox for bpf@vger.kernel.org

From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Yonghong Song <yonghong.song@linux.dev>
Cc: Yonghong Song <yhs@meta.com>, bpf@vger.kernel.org
Subject: Re: Register encoding in assembly for load/store instructions
Date: Tue, 25 Jul 2023 22:09:47 +0200
Message-ID: <87zg3jah2s.fsf@oracle.com>
In-Reply-To: <146bc14b-e15c-6e62-1fa0-4e9e67c974c9@linux.dev> (Yonghong Song's message of "Tue, 25 Jul 2023 12:45:38 -0700")

> On 7/25/23 11:56 AM, Jose E. Marchesi wrote:
>> 
>>> On 7/25/23 10:29 AM, Jose E. Marchesi wrote:
>>>> Hello Yonghong.
>>>> We have noticed that the llvm disassembler uses different notations for
>>>> registers in load and store instructions, depending somehow on the width
>>>> of the data being loaded or stored.
>>>> For example, this is an excerpt from the assembler-disassembler.s test
>>>> file in llvm:
>>>>     // Note: For the group below w1 is used as a destination for sizes u8, u16, u32.
>>>>     //       This is disassembler quirk, but is technically not wrong, as there are
>>>>     //       no different encodings for 'r1 = load' vs 'w1 = load'.
>>>>     //
>>>>     // CHECK: 71 21 2a 00 00 00 00 00	w1 = *(u8 *)(r2 + 0x2a)
>>>>     // CHECK: 69 21 2a 00 00 00 00 00	w1 = *(u16 *)(r2 + 0x2a)
>>>>     // CHECK: 61 21 2a 00 00 00 00 00	w1 = *(u32 *)(r2 + 0x2a)
>>>>     // CHECK: 79 21 2a 00 00 00 00 00	r1 = *(u64 *)(r2 + 0x2a)
>>>>     r1 = *(u8*)(r2 + 42)
>>>>     r1 = *(u16*)(r2 + 42)
>>>>     r1 = *(u32*)(r2 + 42)
>>>>     r1 = *(u64*)(r2 + 42)
>>>> The comment there clarifies that the usage of wN instead of rN in the
>>>> u8, u16 and u32 cases is a "disassembler quirk".
>>>> Anyway, the problem is that it seems that `clang -S' actually emits
>>>> these forms with wN.
>>>> Is that intended?
>>>
>>> Yes, this is intended since alu32 mode is enabled where
>>> w* registers are used for 8/16/32 bit load.
>> So then why supporting 'r1 = 8948 8*9r2 + 0x2a)'?  The mode is still
>> alu32 mode.  Isn't the u{8,16,32} part enough to discriminate?
>
> What does this 'r1 = 8948 8*9r2 + 0x2a)' mean?
>
> For u8/u16/u32 loads, if objdump with option to indicate alu32 mode,
> then w* register is used. If no alu32 mode for objdump, then r* register
> is used. Basically the same insn, disasm is different depending on
> alu32 mode or not. u8/u16/u32 is not enough to differentiate.

Ok, so the llvm objdump has a switch that tells it when to use rN or wN
when printing these particular instructions.  That's the "disassembler
quirk".  To what purpose?  Isn't the person passing the command line
switch the same person reading the disassembled program?  Is this "alu32
mode" more than a cosmetic thing?
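
A minimal sketch of the behaviour described in the quoted reply, written in
Python purely for illustration.  It is not llvm's implementation: the function
name disasm_ldx and the alu32 parameter are hypothetical, and the opcode table
only covers the four load instructions from the test excerpt quoted above.

```python
import struct

# Opcodes of the four plain loads in the quoted test excerpt.
SIZE_BY_OPCODE = {0x71: "u8", 0x69: "u16", 0x61: "u32", 0x79: "u64"}


def disasm_ldx(insn: bytes, alu32: bool) -> str:
    """Print one 8-byte load instruction (hypothetical helper).

    The bytes are identical in both modes; only the spelling of the
    destination register changes.
    """
    # Layout: opcode, register byte, signed 16-bit offset, signed 32-bit imm.
    opcode, regs, off, _imm = struct.unpack("<BBhi", insn)
    size = SIZE_BY_OPCODE[opcode]
    dst, src = regs & 0x0F, regs >> 4
    # Assumption taken from the thread: wN for u8/u16/u32 in alu32 mode only.
    prefix = "w" if alu32 and size != "u64" else "r"
    sign = "+" if off >= 0 else "-"
    return f"{prefix}{dst} = *({size} *)(r{src} {sign} {abs(off):#x})"


insn = bytes.fromhex("71212a0000000000")
print(disasm_ldx(insn, alu32=True))   # w1 = *(u8 *)(r2 + 0x2a)
print(disasm_ldx(insn, alu32=False))  # r1 = *(u8 *)(r2 + 0x2a)
```

Under these assumptions the flag changes nothing but the printed text.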

But what concerns us is the assembler, not the disassembler.

clang -S (which is not objdump) seems to generate these instructions
with wN (see https://godbolt.org/z/5G433Yvrb for a store instruction for
example) and we assume the output of clang -S is intended to be passed
to an assembler, much like with gcc -S.

So, should we support both syntaxes as _input_ syntax in the assembler?
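
To make the question concrete, here is a rough sketch of an assembler front
end that accepts both spellings as input and emits the same bytes for each.
It is an illustration in Python, not code from gas or llvm: assemble_ldx and
LDX_RE are hypothetical names, and rejecting wN with u64 is an assumption
rather than documented behaviour.

```python
import re
import struct

# Opcodes of the four plain loads in the quoted test excerpt.
OPCODE_BY_SIZE = {"u8": 0x71, "u16": 0x69, "u32": 0x61, "u64": 0x79}

# Matches e.g. "r1 = *(u8*)(r2 + 42)" and "w1 = *(u8 *)(r2 + 0x2a)".
LDX_RE = re.compile(
    r"^\s*([rw])(\d+)\s*=\s*\*\s*\(\s*(u8|u16|u32|u64)\s*\*\s*\)"
    r"\s*\(\s*r(\d+)\s*([+-])\s*(0x[0-9a-fA-F]+|\d+)\s*\)\s*$"
)


def assemble_ldx(line: str) -> bytes:
    """Encode one load; rN and wN destinations give identical bytes."""
    m = LDX_RE.match(line)
    if m is None:
        raise ValueError(f"cannot parse: {line!r}")
    kind, dst, size, src, sign, off = m.groups()
    if kind == "w" and size == "u64":
        # Assumption: a 32-bit register name makes no sense for a u64 load.
        raise ValueError("wN destination is not valid for a u64 load")
    if int(dst) > 10 or int(src) > 10:
        raise ValueError("register number out of range")
    offset = int(off, 0) if sign == "+" else -int(off, 0)
    regs = (int(src) << 4) | int(dst)
    # Layout: opcode, register byte, signed 16-bit offset, 32-bit imm (zero).
    return struct.pack("<BBhi", OPCODE_BY_SIZE[size], regs, offset, 0)


print(assemble_ldx("w1 = *(u8 *)(r2 + 0x2a)").hex(" "))
print(assemble_ldx("r1 = *(u8*)(r2 + 42)").hex(" "))
```

Both calls print 71 21 2a 00 00 00 00 00, matching the first CHECK line in the
quoted test file.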

>> 
>>> Note that for newer sign-extended loads, even at alu32 mode,
>>> only r* register is used since the sign-extension extends
>>> up to 64 bits for all variants (8/16/32).
>> Yes we noticed that :)

Thread overview: 16+ messages
2023-07-25 17:29 Register encoding in assembly for load/store instructions Jose E. Marchesi
2023-07-25 18:47 ` Yonghong Song
2023-07-25 18:56   ` Jose E. Marchesi
2023-07-25 19:11     ` Jose E. Marchesi
2023-07-25 19:59       ` Yonghong Song
2023-07-25 19:45     ` Yonghong Song
2023-07-25 20:09       ` Jose E. Marchesi [this message]
2023-07-25 22:10         ` Yonghong Song
2023-07-25 22:26           ` Jose E. Marchesi
2023-07-26  0:31             ` Alexei Starovoitov
2023-07-26  0:39               ` Eduard Zingerman
2023-07-26  4:16                 ` Yonghong Song
2023-07-26 14:41                   ` Eduard Zingerman
2023-07-28 16:58                   ` Eduard Zingerman
2023-07-28 21:29                     ` Alexei Starovoitov
2023-07-28 23:25                     ` Yonghong Song
