From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Dave Thaler <dthaler@microsoft.com>
Cc: bpf <bpf@vger.kernel.org>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
"bpf@ietf.org" <bpf@ietf.org>, David Vernet <void@manifault.com>
Subject: Re: [Bpf] [PATCH V3] bpf, docs: Document BPF insn encoding in term of stored bytes
Date: Tue, 28 Feb 2023 02:05:03 +0100
Message-ID: <87v8jm7g74.fsf@oracle.com>
In-Reply-To: <PH7PR21MB3878F2AF288BE7671D61E257A3AF9@PH7PR21MB3878.namprd21.prod.outlook.com> (Dave Thaler's message of "Mon, 27 Feb 2023 21:49:15 +0000")
>> -----Original Message-----
>> From: Bpf <bpf-bounces@ietf.org> On Behalf Of Jose E. Marchesi
>> Sent: Monday, February 27, 2023 1:06 PM
>> To: bpf <bpf@vger.kernel.org>
>> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>; bpf@ietf.org; David
>> Vernet <void@manifault.com>
>> Subject: [Bpf] [PATCH V3] bpf, docs: Document BPF insn encoding in term of
>> stored bytes
>>
>>
>> [Changes from V2:
>> - Use src and dst consistently in the document.
>
> Since my earlier patch, src and dst refer to the values whereas
> src_reg and dst_reg refer to register numbers.
Oh, I didn't realize that!
Yeah sure, resending with src_reg and dst_reg.
>> - Use a more graphical depiction of the 128-bit instruction.
>> - Remove `Where:' fragment.
>> - Clarify that unused bits are reserved and shall be zeroed.]
>>
>> This patch modifies instruction-set.rst so it documents the encoding of BPF
>> instructions in terms of how the bytes are stored (be it in an ELF file or as
>> bytes in a memory buffer to be loaded into the kernel or some other BPF
>> consumer) as opposed to how the instruction looks like once loaded.
>>
>> This is hopefully easier to understand by implementors looking to generate
>> and/or consume bytes conforming BPF instructions.
>>
>> The patch also clarifies that the unused bytes in a pseudo-instruction shall be
>> cleared with zeros.
>>
>> Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
>> ---
>> Documentation/bpf/instruction-set.rst | 63 ++++++++++++++-------------
>> 1 file changed, 33 insertions(+), 30 deletions(-)
>>
>> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
>> index 01802ed9b29b..fae2e48d6a0b 100644
>> --- a/Documentation/bpf/instruction-set.rst
>> +++ b/Documentation/bpf/instruction-set.rst
>> @@ -38,15 +38,11 @@ eBPF has two instruction encodings:
>>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
>>    constant) value after the basic instruction for a total of 128 bits.
>>
>> -The basic instruction encoding looks as follows for a little-endian processor,
>> -where MSB and LSB mean the most significant bits and least significant bits,
>> -respectively:
>> +The fields conforming an encoded basic instruction are stored in the
>> +following order::
>>
>> -=============  =======  =======  =======  ============
>> -32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> -=============  =======  =======  =======  ============
>> -imm            offset   src_reg  dst_reg  opcode
>> -=============  =======  =======  =======  ============
>> + opcode:8 src:4 dst:4 offset:16 imm:32 // In little-endian BPF.
>> + opcode:8 dst:4 src:4 offset:16 imm:32 // In big-endian BPF.
>
> I think those should be src_reg and dst_reg (as the register numbers)
> not src and dst (which are the values) or this will be a documentation
> regression.
>
> Right now I think this is a regression since if I understand right, with this
> patch, "src" and "dst" now refer to both in different places which is
> confusing.
>
> Dave
>
>> **imm**
>> signed integer immediate value
>> @@ -54,48 +50,55 @@ imm offset src_reg dst_reg opcode
>> **offset**
>> signed integer offset used with pointer arithmetic
>>
>> -**src_reg**
>> +**src**
>> the source register number (0-10), except where otherwise specified
>> (`64-bit immediate instructions`_ reuse this field for other purposes)
>>
>> -**dst_reg**
>> +**dst**
>> destination register number (0-10)
>>
>> **opcode**
>> operation to perform
>>
>> -and as follows for a big-endian processor:
>> +Note that the contents of multi-byte fields ('imm' and 'offset') are
>> +stored using big-endian byte ordering in big-endian BPF and
>> +little-endian byte ordering in little-endian BPF.
>>
>> -=============  =======  =======  =======  ============
>> -32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> -=============  =======  =======  =======  ============
>> -imm            offset   dst_reg  src_reg  opcode
>> -=============  =======  =======  =======  ============
>> +For example::
>>
>> -Multi-byte fields ('imm' and 'offset') are similarly stored in
>> -the byte order of the processor.
>> +  opcode         offset imm          assembly
>> +         src dst
>> +  07     0   1   00 00  44 33 22 11  r1 += 0x11223344 // little
>> +         dst src
>> +  07     1   0   00 00  11 22 33 44  r1 += 0x11223344 // big
>>
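For anyone who wants to check the two example encodings above, here is
a minimal sketch in Python. It is not part of the patch: the helper
name encode_insn and its parameters are hypothetical, and it assumes
the register byte holds src_reg in the high nibble for little-endian
BPF and dst_reg in the high nibble for big-endian BPF, as the field
order in the patch suggests.

```python
import struct

def encode_insn(opcode, dst_reg, src_reg, offset, imm, big_endian=False):
    """Return the 8 stored bytes of a basic BPF instruction.

    Hypothetical helper for illustration only.
    """
    if big_endian:
        # Stored order is opcode:8 dst:4 src:4 offset:16 imm:32.
        regs = (dst_reg << 4) | src_reg
        fmt = ">BBhi"
    else:
        # Stored order is opcode:8 src:4 dst:4 offset:16 imm:32.
        regs = (src_reg << 4) | dst_reg
        fmt = "<BBhi"
    # B = 8-bit unsigned, h = 16-bit signed, i = 32-bit signed; with an
    # explicit byte-order prefix struct inserts no padding.
    return struct.pack(fmt, opcode, regs, offset, imm)

# r1 += 0x11223344 (opcode 0x07), the example used in the patch.
le = encode_insn(0x07, dst_reg=1, src_reg=0, offset=0, imm=0x11223344)
be = encode_insn(0x07, dst_reg=1, src_reg=0, offset=0, imm=0x11223344,
                 big_endian=True)
print(le.hex())
print(be.hex())
```

The only differences between the two results are the nibble order in
the second byte and the byte order inside the multi-byte fields; the
order of the fields themselves is the same.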
>> Note that most instructions do not use all of the fields.
>> Unused fields shall be cleared to zero.
>>
>> -As discussed below in `64-bit immediate instructions`_, a 64-bit immediate
>> -instruction uses a 64-bit immediate value that is constructed as follows.
>> -The 64 bits following the basic instruction contain a pseudo instruction
>> -using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
>> -and imm containing the high 32 bits of the immediate value.
>> +As discussed below in `64-bit immediate instructions`_, a 64-bit
>> +immediate instruction uses a 64-bit immediate value that is constructed
>> +as follows. The 64 bits following the basic instruction contain a
>> +pseudo instruction using the same format but with opcode, dst, src, and
>> +offset all set to zero, and imm containing the high 32 bits of the
>> +immediate value.
>>
>> -=================  ==================
>> -64 bits (MSB)      64 bits (LSB)
>> -=================  ==================
>> -basic instruction  pseudo instruction
>> -=================  ==================
>> +This is depicted in the following figure::
>> +
>> +  basic_instruction
>> +  .-----------------------------.
>> +  |                             |
>> +  code:8 regs:16 offset:16 imm:32 unused:32 imm:32
>> +                                  |              |
>> +                                  '--------------'
>> +                                 pseudo instruction
>>
>> Thus the 64-bit immediate value is constructed as follows:
>>
>> imm64 = (next_imm << 32) | imm
>>
>>  where 'next_imm' refers to the imm value of the pseudo instruction
>> -following the basic instruction.
>> +following the basic instruction. The unused bytes in the pseudo
>> +instruction are reserved and shall be cleared to zero.
>>
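The imm64 construction above can be sketched the same way. Again this
is only an illustration, not part of the patch: the function name
imm64_from_wide is hypothetical, and it assumes both imm fields are
read as unsigned 32-bit values so that the shift-and-or in the formula
needs no masking.

```python
import struct

def imm64_from_wide(insn_bytes, big_endian=False):
    """Rebuild the 64-bit immediate from a 16-byte wide instruction.

    Hypothetical helper for illustration only.
    """
    order = ">" if big_endian else "<"
    # Each 64-bit half is laid out as opcode:8 regs:8 offset:16 imm:32.
    # imm is read unsigned (I) here, which is an assumption of this sketch.
    fmt = order + "BBhI"
    _, _, _, imm = struct.unpack_from(fmt, insn_bytes, 0)
    next_opcode, next_regs, next_off, next_imm = struct.unpack_from(
        fmt, insn_bytes, 8)
    # The first 32 bits of the pseudo instruction are reserved.
    if next_opcode or next_regs or next_off:
        raise ValueError("reserved bytes of the pseudo instruction must be zero")
    return (next_imm << 32) | imm

# Little-endian bytes of a wide instruction loading 0x1122334455667788
# into r1 (opcode 0x18).
wide_le = bytes.fromhex("18010000887766550000000044332211")
print(hex(imm64_from_wide(wide_le)))
```

Rejecting non-zero reserved bytes is one possible reading of "shall be
cleared to zero"; a consumer could equally choose to ignore them.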
>> Instruction classes
>> -------------------
>> @@ -137,7 +140,7 @@ code source instruction class
>>    source  value  description
>>    ======  =====  ==============================================
>>    BPF_K   0x00   use 32-bit 'imm' value as source operand
>> -  BPF_X   0x08   use 'src_reg' register value as source operand
>> +  BPF_X   0x08   use 'src' register value as source operand
>>    ======  =====  ==============================================
>>
>> **instruction class**
>> --
>> 2.30.2
>>
>> --
>> Bpf mailing list
>> Bpf@ietf.org
>> https://www.ietf.org/mailman/listinfo/bpf