From: David Vernet <void@manifault.com>
To: Dave Thaler <dthaler@microsoft.com>
Cc: "Jose E. Marchesi" <jose.marchesi@oracle.com>,
	bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	"bpf@ietf.org" <bpf@ietf.org>
Subject: Re: [Bpf] [PATCH V3] bpf, docs: Document BPF insn encoding in term of stored bytes
Date: Mon, 27 Feb 2023 17:42:18 -0600
Message-ID: <Y/0/2pCw7d9b3Ji/@maniforge>
In-Reply-To: <PH7PR21MB3878F2AF288BE7671D61E257A3AF9@PH7PR21MB3878.namprd21.prod.outlook.com>

On Mon, Feb 27, 2023 at 09:49:15PM +0000, Dave Thaler wrote:
> > -----Original Message-----
> > From: Bpf <bpf-bounces@ietf.org> On Behalf Of Jose E. Marchesi
> > Sent: Monday, February 27, 2023 1:06 PM
> > To: bpf <bpf@vger.kernel.org>
> > Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>; bpf@ietf.org; David
> > Vernet <void@manifault.com>
> > Subject: [Bpf] [PATCH V3] bpf, docs: Document BPF insn encoding in term of
> > stored bytes
> > 
> > 
> > [Changes from V2:
> > - Use src and dst consistently in the document.
> 
> Since my earlier patch, src and dst refer to the values whereas
> src_reg and dst_reg refer to register numbers.
> 
> > - Use a more graphical depiction of the 128-bit instruction.
> > - Remove `Where:' fragment.
> > - Clarify that unused bits are reserved and shall be zeroed.]
> > 
> > This patch modifies instruction-set.rst so it documents the encoding of BPF
> > instructions in terms of how the bytes are stored (be it in an ELF file or as
> > bytes in a memory buffer to be loaded into the kernel or some other BPF
> > consumer) as opposed to how the instruction looks like once loaded.
> > 
> > This is hopefully easier to understand by implementors looking to generate
> > and/or consume bytes conforming BPF instructions.
> > 
> > The patch also clarifies that the unused bytes in a pseudo-instruction shall be
> > cleared with zeros.
> > 
> > Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
> > ---
> >  Documentation/bpf/instruction-set.rst | 63 ++++++++++++++-------------
> >  1 file changed, 33 insertions(+), 30 deletions(-)
> > 
> > diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> > index 01802ed9b29b..fae2e48d6a0b 100644
> > --- a/Documentation/bpf/instruction-set.rst
> > +++ b/Documentation/bpf/instruction-set.rst
> > @@ -38,15 +38,11 @@ eBPF has two instruction encodings:
> >  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
> >    constant) value after the basic instruction for a total of 128 bits.
> > 
> > -The basic instruction encoding looks as follows for a little-endian processor,
> > -where MSB and LSB mean the most significant bits and least significant bits,
> > -respectively:
> > +The fields conforming an encoded basic instruction are stored in the
> > +following order::
> > 
> > -=============  =======  =======  =======  ============
> > -32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
> > -=============  =======  =======  =======  ============
> > -imm            offset   src_reg  dst_reg  opcode
> > -=============  =======  =======  =======  ============
> > +  opcode:8 src:4 dst:4 offset:16 imm:32 // In little-endian BPF.
> > +  opcode:8 dst:4 src:4 offset:16 imm:32 // In big-endian BPF.
> 
> I think those should be src_reg and dst_reg (as the register numbers)
> not src and dst (which are the values) or this will be a documentation
> regression.
> 
> Right now I think this is a regression since if I understand right, with this
> patch, "src" and "dst" now refer to both in different places which is
> confusing.

Fair enough -- this was my suggestion, and in hindsight I agree that it
would probably be best to avoid ambiguity by using src_reg and dst_reg
here. Apologies for the churn, Jose.
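
For reference, here is a minimal C sketch of the stored-byte layout as I
read it from the patch, using src_reg and dst_reg for the register
numbers. The helper name bpf_insn_store and its big_endian flag are
hypothetical and only for illustration; this is not a kernel or libbpf
API.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel or libbpf API): write the 8 stored
 * bytes of one basic instruction, following the field order in the patch.
 */
static void bpf_insn_store(uint8_t out[8], int big_endian, uint8_t opcode,
			   uint8_t dst_reg, uint8_t src_reg,
			   int16_t offset, int32_t imm)
{
	uint16_t off = (uint16_t)offset;
	uint32_t val = (uint32_t)imm;
	int i;

	out[0] = opcode;

	/* The first-listed register field sits in the high nibble of byte 1. */
	if (big_endian)
		out[1] = (uint8_t)(((dst_reg & 0xf) << 4) | (src_reg & 0xf));
	else
		out[1] = (uint8_t)(((src_reg & 0xf) << 4) | (dst_reg & 0xf));

	/* Multi-byte fields follow the byte order of the BPF flavor. */
	for (i = 0; i < 2; i++)
		out[2 + i] = (uint8_t)(off >> (big_endian ? 8 * (1 - i) : 8 * i));
	for (i = 0; i < 4; i++)
		out[4 + i] = (uint8_t)(val >> (big_endian ? 8 * (3 - i) : 8 * i));
}
```

Calling it with opcode 0x07, dst_reg 1, src_reg 0, offset 0 and imm
0x11223344 should reproduce the two byte sequences in the "r1 +=
0x11223344" example from the patch.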

> 
> Dave
> 
> >  **imm**
> >    signed integer immediate value
> > @@ -54,48 +50,55 @@ imm            offset   src_reg  dst_reg  opcode
> >  **offset**
> >    signed integer offset used with pointer arithmetic
> > 
> > -**src_reg**
> > +**src**
> >    the source register number (0-10), except where otherwise specified
> >    (`64-bit immediate instructions`_ reuse this field for other purposes)
> > 
> > -**dst_reg**
> > +**dst**
> >    destination register number (0-10)
> > 
> >  **opcode**
> >    operation to perform
> > 
> > -and as follows for a big-endian processor:
> > +Note that the contents of multi-byte fields ('imm' and 'offset') are
> > +stored using big-endian byte ordering in big-endian BPF and
> > +little-endian byte ordering in little-endian BPF.
> > 
> > -=============  =======  =======  =======  ============
> > -32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
> > -=============  =======  =======  =======  ============
> > -imm            offset   dst_reg  src_reg  opcode
> > -=============  =======  =======  =======  ============
> > +For example::
> > 
> > -Multi-byte fields ('imm' and 'offset') are similarly stored in
> > -the byte order of the processor.
> > +  opcode         offset imm          assembly
> > +         src dst
> > +  07     0   1   00 00  44 33 22 11  r1 += 0x11223344 // little
> > +         dst src
> > +  07     1   0   00 00  11 22 33 44  r1 += 0x11223344 // big
> > 
> >  Note that most instructions do not use all of the fields.
> >  Unused fields shall be cleared to zero.
> > 
> > -As discussed below in `64-bit immediate instructions`_, a 64-bit immediate
> > -instruction uses a 64-bit immediate value that is constructed as follows.
> > -The 64 bits following the basic instruction contain a pseudo instruction
> > -using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
> > -and imm containing the high 32 bits of the immediate value.
> > +As discussed below in `64-bit immediate instructions`_, a 64-bit
> > +immediate instruction uses a 64-bit immediate value that is constructed
> > +as follows.  The 64 bits following the basic instruction contain a
> > +pseudo instruction using the same format but with opcode, dst, src, and
> > +offset all set to zero, and imm containing the high 32 bits of the
> > +immediate value.
> > 
> > -=================  ==================
> > -64 bits (MSB)      64 bits (LSB)
> > -=================  ==================
> > -basic instruction  pseudo instruction
> > -=================  ==================
> > +This is depicted in the following figure::
> > +
> > +        basic_instruction
> > +  .-----------------------------.
> > +  |                             |
> > +  code:8 regs:16 offset:16 imm:32 unused:32 imm:32
> > +                                  |              |
> > +                                  '--------------'
> > +                                 pseudo instruction
> > 
> >  Thus the 64-bit immediate value is constructed as follows:
> > 
> >    imm64 = (next_imm << 32) | imm
> > 
> >  where 'next_imm' refers to the imm value of the pseudo instruction
> > -following the basic instruction.
> > +following the basic instruction.  The unused bytes in the pseudo
> > +instruction are reserved and shall be cleared to zero.
> > 
> >  Instruction classes
> >  -------------------
> > @@ -137,7 +140,7 @@ code            source  instruction class
> >    source  value  description
> >    ======  =====  ==============================================
> >    BPF_K   0x00   use 32-bit 'imm' value as source operand
> > -  BPF_X   0x08   use 'src_reg' register value as source operand
> > +  BPF_X   0x08   use 'src' register value as source operand
> >    ======  =====  ==============================================
> > 
> >  **instruction class**
> > --
> > 2.30.2
> > 
