Re: [Bpf] [PATCH bpf-next v2] bpf, docs: Add explanation of endianness

From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Dave Thaler <dthaler1968=40googlemail.com@dmarc.ietf.org>,
	bpf <bpf@vger.kernel.org>,
	bpf@ietf.org, Dave Thaler <dthaler@microsoft.com>,
	David Vernet <void@manifault.com>
Subject: Re: [Bpf] [PATCH bpf-next v2] bpf, docs: Add explanation of endianness
Date: Thu, 23 Feb 2023 00:23:41 +0100
Message-ID: <87ttzdwagy.fsf@oracle.com>
In-Reply-To: <CAADnVQ++hR7Cj3OXGLWpV_=4MnFndq5qS8r5b-YYPC_OB=gjQg@mail.gmail.com> (Alexei Starovoitov's message of "Wed, 22 Feb 2023 14:10:49 -0800")


> On Mon, Feb 20, 2023 at 2:37 PM Dave Thaler
> <dthaler1968=40googlemail.com@dmarc.ietf.org> wrote:
>>
>> From: Dave Thaler <dthaler@microsoft.com>
>>
>> Document the discussion from the email thread on the IETF bpf list,
>> where it was explained that the raw format varies by endianness
>> of the processor.
>>
>> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
>>
>> Acked-by: David Vernet <void@manifault.com>
>> ---
>>
>> V1 -> V2: rebased on top of latest master
>> ---
>>  Documentation/bpf/instruction-set.rst | 16 ++++++++++++++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
>> index af515de5fc3..1d473f060fa 100644
>> --- a/Documentation/bpf/instruction-set.rst
>> +++ b/Documentation/bpf/instruction-set.rst
>> @@ -38,8 +38,9 @@ eBPF has two instruction encodings:
>>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
>>    constant) value after the basic instruction for a total of 128 bits.
>>
>> -The basic instruction encoding is as follows, where MSB and LSB mean the most significant
>> -bits and least significant bits, respectively:
>> +The basic instruction encoding looks as follows for a little-endian processor,
>> +where MSB and LSB mean the most significant bits and least significant bits,
>> +respectively:
>>
>>  =============  =======  =======  =======  ============
>>  32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> @@ -63,6 +64,17 @@ imm            offset   src_reg  dst_reg  opcode
>>  **opcode**
>>    operation to perform
>>
>> +and as follows for a big-endian processor:
>> +
>> +=============  =======  ====================  ===============  ============
>> +32 bits (MSB)  16 bits  4 bits                4 bits           8 bits (LSB)
>> +=============  =======  ====================  ===============  ============
>> +immediate      offset   destination register  source register  opcode
>> +=============  =======  ====================  ===============  ============
>
> I've changed it to:
> imm            offset   dst_reg  src_reg  opcode
>
> to match the little endian table,
> but now one of the tables feels wrong.
> The encoding is always done by applying C standard to the struct:
> struct bpf_insn {
>         __u8    code;           /* opcode */
>         __u8    dst_reg:4;      /* dest register */
>         __u8    src_reg:4;      /* source register */
>         __s16   off;            /* signed offset */
>         __s32   imm;            /* signed immediate constant */
> };
> I'm not sure how to express this clearly in the table.

Perhaps it would be simpler to document how the instruction bytes are
stored (be it in an ELF file or as bytes in a memory buffer to be loaded
into the kernel or some other BPF consumer), as opposed to what the
instructions look like once loaded (as a 64-bit word) by a little-endian
or big-endian kernel?

Stored little-endian BPF instructions:

  code src_reg dst_reg off imm

  foo-le.o:     file format elf64-bpfle

  0000000000000000 <.text>:
     0:   07 01 00 00 ef be ad de         r1 += 0xdeadbeef

Stored big-endian BPF instructions:

  code dst_reg src_reg off imm

  foo-be.o:     file format elf64-bpfbe

  0000000000000000 <.text>:
     0:   07 10 00 00 de ad be ef         r1 += 0xdeadbeef

i.e. in the stored bytes the code always comes first, then the
registers, then the offset, then the immediate, regardless of
endianness.
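
To make the stored layout concrete, here is a minimal sketch of an
encoder that produces the two byte sequences shown above.  The helper
name bpf_store_insn and its big_endian flag are hypothetical, chosen
only for illustration; this is not a kernel or libbpf interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel or libbpf API): write the 8 stored
 * bytes of one basic BPF instruction.  A non-zero big_endian selects
 * the elf64-bpfbe layout, zero selects the elf64-bpfle layout.
 */
static void bpf_store_insn(uint8_t out[8], uint8_t code,
                           uint8_t dst_reg, uint8_t src_reg,
                           int16_t off, int32_t imm, int big_endian)
{
    uint16_t uoff = (uint16_t)off;
    uint32_t uimm = (uint32_t)imm;

    assert(dst_reg < 16 && src_reg < 16);  /* each register is 4 bits */

    out[0] = code;                         /* opcode is always byte 0 */
    if (big_endian) {
        /* dst_reg in the high nibble, multi-byte fields MSB first */
        out[1] = (uint8_t)((dst_reg << 4) | src_reg);
        out[2] = (uint8_t)(uoff >> 8);
        out[3] = (uint8_t)(uoff & 0xff);
        out[4] = (uint8_t)(uimm >> 24);
        out[5] = (uint8_t)((uimm >> 16) & 0xff);
        out[6] = (uint8_t)((uimm >> 8) & 0xff);
        out[7] = (uint8_t)(uimm & 0xff);
    } else {
        /* src_reg in the high nibble, multi-byte fields LSB first */
        out[1] = (uint8_t)((src_reg << 4) | dst_reg);
        out[2] = (uint8_t)(uoff & 0xff);
        out[3] = (uint8_t)(uoff >> 8);
        out[4] = (uint8_t)(uimm & 0xff);
        out[5] = (uint8_t)((uimm >> 8) & 0xff);
        out[6] = (uint8_t)((uimm >> 16) & 0xff);
        out[7] = (uint8_t)(uimm >> 24);
    }
}
```

Calling bpf_store_insn(buf, 0x07, 1, 0, 0, (int32_t)0xdeadbeef, 0)
should give 07 01 00 00 ef be ad de, and the same call with a non-zero
last argument should give 07 10 00 00 de ad be ef.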

This may be easier to understand for implementors looking to generate
and/or consume bytes that conform to the BPF instruction encoding.
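
On the consuming side, a reader of the stored bytes could look roughly
like the sketch below.  The struct bpf_fields type and the
bpf_load_insn function are hypothetical names used only for this
illustration, and the caller is assumed to already know the byte order
(for instance from the ELF header of the object file).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one basic BPF instruction. */
struct bpf_fields {
    uint8_t code;
    uint8_t dst_reg;
    uint8_t src_reg;
    int16_t off;
    int32_t imm;
};

/*
 * Hypothetical helper: decode 8 stored bytes.  A non-zero big_endian
 * means the bytes use the big-endian layout, zero means little-endian.
 */
static struct bpf_fields bpf_load_insn(const uint8_t in[8], int big_endian)
{
    struct bpf_fields f;
    uint16_t uoff;
    uint32_t uimm;

    assert(in != NULL);

    f.code = in[0];                        /* opcode is always byte 0 */
    if (big_endian) {
        f.dst_reg = in[1] >> 4;            /* dst_reg in the high nibble */
        f.src_reg = in[1] & 0x0f;
        uoff = (uint16_t)((in[2] << 8) | in[3]);
        uimm = ((uint32_t)in[4] << 24) | ((uint32_t)in[5] << 16) |
               ((uint32_t)in[6] << 8)  |  (uint32_t)in[7];
    } else {
        f.src_reg = in[1] >> 4;            /* src_reg in the high nibble */
        f.dst_reg = in[1] & 0x0f;
        uoff = (uint16_t)(in[2] | (in[3] << 8));
        uimm =  (uint32_t)in[4]        | ((uint32_t)in[5] << 8) |
               ((uint32_t)in[6] << 16) | ((uint32_t)in[7] << 24);
    }
    f.off = (int16_t)uoff;                 /* fields are signed */
    f.imm = (int32_t)uimm;
    return f;
}
```

Only the position of the two register nibbles and the byte order of
off and imm depend on endianness; the opcode is read from the first
byte in both cases.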

Thread overview: 8+ messages
2023-02-20 22:37 [PATCH bpf-next v2] bpf, docs: Add explanation of endianness Dave Thaler
2023-02-22 22:10 ` patchwork-bot+netdevbpf
2023-02-22 22:10 ` [Bpf] " Alexei Starovoitov
2023-02-22 23:23   ` Jose E. Marchesi [this message]
2023-02-23  1:56     ` Alexei Starovoitov
2023-02-23 13:18       ` Jose E. Marchesi
2023-02-23 16:40         ` Alexei Starovoitov
2023-02-23 16:42           ` Jose E. Marchesi
