bpf.vger.kernel.org archive mirror
From: David Vernet <void@manifault.com>
To: dthaler1968@googlemail.com
Cc: bpf@ietf.org, bpf@vger.kernel.org, jose.marchesi@oracle.com
Subject: Re: [Bpf] Standardizing BPF assembly language?
Date: Tue, 23 Jan 2024 15:52:14 -0600
Message-ID: <20240123215214.GC221862@maniforge>
In-Reply-To: <1e9101da4e44$e24a1720$a6de4560$@gmail.com>

On Tue, Jan 23, 2024 at 01:41:10PM -0800, dthaler1968@googlemail.com wrote:
> > -----Original Message-----
> > From: David Vernet <void@manifault.com>
> > Sent: Tuesday, January 23, 2024 1:31 PM
> > To: dthaler1968@googlemail.com
> > Cc: bpf@ietf.org; bpf@vger.kernel.org; jose.marchesi@oracle.com
> > Subject: Re: [Bpf] Standardizing BPF assembly language?
> > 
> > On Tue, Jan 23, 2024 at 08:45:32AM -0800,
> > dthaler1968=40googlemail.com@dmarc.ietf.org wrote:
> > > At LSF/MM/BPF 2023, Jose gave a presentation about BPF assembly
> > > language (http://vger.kernel.org/bpfconf2023_material/compiled_bpf.txt).
> > >
> > > Jose wrote in that link:
> > > > There are two dialects of BPF assembler in use today:
> > > >
> > > > - A "pseudo-c" dialect (originally "BPF verifier format")
> > > >  : r1 = *(u64 *)(r2 + 0x00f0)
> > > >  : if r1 > 2 goto label
> > > >  : lock *(u32 *)(r2 + 10) += r3
> > > >
> > > > - An "assembler-like" dialect
> > > >  : ldxdw %r1, [%r2 + 0x00f0]
> > > >  : jgt %r1, 2, label
> > > >  : xaddw [%r2 + 2], r3
> > >
> > > During Jose's talk, I discovered that uBPF didn't quite match the
> > > second dialect and submitted a bug report.  By the time the conference
> > > was over, uBPF had been updated to match GCC, so that discussion
> > > worked to reduce the number of variants.
> > >
> > > As more instructions get added and supported by more tools and
> > > compilers there's the risk of even more variants unless it's
> > > standardized.
> > >
> > > Hence I'd recommend that BPF assembly language get documented in some
> > > WG draft.  If folks agree with that premise, the first question is
> > > then: which document?
> > 
> > > One possible answer would be the ISA document that specifies the
> > > instructions, since that way the IANA registry could list the
> > > assembly for each instruction, and any future documents that add
> > > instructions would necessarily need to specify the assembly for them,
> > > preventing variants from springing up for new instructions.
> > 
> > I'm not opposed to this, but would strongly prefer that we do it as an
> > extension if we go this route to avoid scope creep for the first
> > iteration.
> 
> If the first iteration does not have it, then presumably the initial
> IANA registry would not have it either, since this iteration creates
> the registry and the rules for it.
> 
> That's doable, but may continue to proliferate more and more variants
> until it is addressed.

The same could be said for any new instructions that are added while we
sort out standardizing the assembly language as well, no?

> If it's in another document, do you agree it would still fall under
> the existing charter bullet about "defining the instructions"
> > [PS] the BPF instruction set architecture (ISA) that defines the
> > instructions and low-level virtual machine for BPF programs,
> ?

I wouldn't say it's illogical to group assembly language in this bucket,
but I would say that defining the assembly language does not need to be
tied at the hip with defining instruction encodings and semantics. So my
answer is "yes, I think it belongs here", but I also don't think it's
necessary or desirable for the first iteration.

> > > A second question would be, which dialect(s) to standardize.  Jose's
> > > link above argues that the second dialect should be the one
> > > standardized (tools are free to support multiple dialects for
> > > backwards compat if they want).  See the link for rationale.
> > 
> > My recollection was that the outcome of that discussion is that we were
> > going to continue to support both. If we wanted to standardize, I have a
> > hard time seeing any other way other than to standardize both dialects
> > unless there's been a significant change in sentiment since LSFMM.
> 
> If "standardize both", does that mean neither is mandatory and each tool
> is free to pick one or the other?  And would the IANA registry require a
> document adding any new instructions to specify the assembly in both
> dialects?

Well, if we're standardizing on both, then yes I think it would be
mandatory for a tool to support both, and I think instructions would
require assembly for both dialects. Practically speaking that's already
what's happening, no? Both dialects are already pervasive, so it seems
unlikely that a tool would succeed without supporting both regardless.
To Jose's point (pasted below), there are of course drawbacks:

> - Expensive :: it makes it very difficult to reuse infrastructure.
> - Problematic :: dis/assemblers, CGEN, LaTeX, editors, IDEs, etc.
> - Ambiguous :: with both GAS and llvm/MCParser: symbol assignments.
> - Pervasive :: because of the inline asm.

I think it would be a lot simpler to standardize on only a single
dialect, but I also think the standard should reflect how BPF is being
used in practice.

