public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Dave Thaler <dthaler=40microsoft.com@dmarc.ietf.org>
Cc: bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	"bpf@ietf.org" <bpf@ietf.org>
Subject: Re: [Bpf] [PATCH] bpf, docs: Document BPF insn encoding in term of stored bytes
Date: Fri, 24 Feb 2023 21:59:24 +0100	[thread overview]
Message-ID: <87h6va230z.fsf@oracle.com> (raw)
In-Reply-To: <PH7PR21MB3878B8C1197ACE5318E332A8A3A89@PH7PR21MB3878.namprd21.prod.outlook.com> (Dave Thaler's message of "Fri, 24 Feb 2023 20:44:09 +0000")


>> -----Original Message-----
>> From: Bpf <bpf-bounces@ietf.org> On Behalf Of Jose E. Marchesi
>> Sent: Friday, February 24, 2023 12:04 PM
>> To: bpf <bpf@vger.kernel.org>
>> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>; bpf@ietf.org
>> Subject: [Bpf] [PATCH] bpf, docs: Document BPF insn encoding in term of
>> stored bytes
>> 
>> 
>> This patch modifies instruction-set.rst so it documents the encoding of BPF
>> instructions in terms of how the bytes are stored (be it in an ELF file or as
>> bytes in a memory buffer to be loaded into the kernel or some other BPF
>> consumer) as opposed to how the instruction looks like once loaded.
>> 
>> This is hopefully easier to understand by implementors looking to generate
>> and/or consume bytes conforming BPF instructions.
>> 
>> The patch also clarifies that the unused bytes in a pseudo-instruction shall be
>> cleared with zeros.
>> 
>> Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
>> ---
>>  Documentation/bpf/instruction-set.rst | 43 +++++++++++++--------------
>>  1 file changed, 21 insertions(+), 22 deletions(-)
>> 
>> diff --git a/Documentation/bpf/instruction-set.rst
>> b/Documentation/bpf/instruction-set.rst
>> index 01802ed9b29b..9b28c0e15bb6 100644
>> --- a/Documentation/bpf/instruction-set.rst
>> +++ b/Documentation/bpf/instruction-set.rst
>> @@ -38,15 +38,13 @@ eBPF has two instruction encodings:
>>  * the wide instruction encoding, which appends a second 64-bit immediate
>> (i.e.,
>>    constant) value after the basic instruction for a total of 128 bits.
>> 
>> -The basic instruction encoding looks as follows for a little-endian processor,
>> -where MSB and LSB mean the most significant bits and least significant bits,
>> -respectively:
>> +The fields conforming an encoded basic instruction are stored in the
>> +following order:
>> 
>> -=============  =======  =======  =======  ============
>> -32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> -=============  =======  =======  =======  ============
>> -imm            offset   src_reg  dst_reg  opcode
>> -=============  =======  =======  =======  ============
>> +  opcode:8 src:4 dst:4 offset:16 imm:32 // In little-endian BPF.
>> +  opcode:8 dst:4 src:4 offset:16 imm:32 // In big-endian BPF.
>
> Personally I find this notation harder to understand in general.
> For example, it encodes (without explanation) the C language
> assumption that "//" is a comment, ":" indicates a bit width,
> and the fields are in order from most significate byte to least
> significant byte.  The text before this change has no such
> unexplained assumptions. 

The fields are not ordered from "most significative byte" to "least
significative byte".  The fields are ordered as they are stored.  Thats
the whole point of the patch.

As for //, :N and | below, I think these signs are obvious enough to not
require further explanation, but I wouldn't mind to use some other
better notation, if you can suggest one. I am not a very graphical
person myself.

>
> [...]
>> -Multi-byte fields ('imm' and 'offset') are similarly stored in -the byte order of
>> the processor.
>> +  opcode         offset imm          assembly
>> +         src dst
>> +  07     0   1   00 00  44 33 22 11  r1 += 0x11223344 // little
>> +         dst src
>> +  07     1   0   00 00  11 22 33 44  r1 += 0x11223344 // big
>
> Similar assumption without explanation of "//" meaning comment, and
> some implied tabular formatting without being an actual table?

It is intended to be a diagram, not a table.  I used indentation which
AFAIK is the rst way to denote multi-line verbatim environments... is
that wrong for ascii-art diagrams?

> [...]
>> -=================  ==================
>> -64 bits (MSB)      64 bits (LSB)
>> -=================  ==================
>> -basic instruction  pseudo instruction
>> -=================  ==================
>> +This is depicted in the following figure:
>> +
>> +  basic_instruction                 pseudo_instruction
>> +  code:8 regs:16 offset:16 imm:32 | unused:32 imm:32
>
> And here the use of "|" above I find confusing.
>
> What do others think?
>
> Dave

  reply	other threads:[~2023-02-24 20:59 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-24 20:04 [PATCH] bpf, docs: Document BPF insn encoding in term of stored bytes Jose E. Marchesi
2023-02-24 20:44 ` [Bpf] " Dave Thaler
2023-02-24 20:59   ` Jose E. Marchesi [this message]
2023-02-25  0:20     ` Alexei Starovoitov
2023-02-25 13:07 ` kernel test robot
2023-02-27 16:52 ` [Bpf] " David Vernet
2023-02-27 18:05   ` Jose E. Marchesi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h6va230z.fsf@oracle.com \
    --to=jose.marchesi@oracle.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bpf@ietf.org \
    --cc=bpf@vger.kernel.org \
    --cc=dthaler=40microsoft.com@dmarc.ietf.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox