From: dthaler1968@googlemail.com
To: "'David Vernet'" <void@manifault.com>
Cc: <bpf@ietf.org>, "'bpf'" <bpf@vger.kernel.org>
Subject: RE: [Bpf] BPF ISA conformance groups
Date: Tue, 12 Dec 2023 14:01:09 -0800
Message-ID: <157b01da2d46$b7453e20$25cfba60$@gmail.com>
In-Reply-To: <20231212214532.GB1222@maniforge>
David Vernet <void@manifault.com> wrote:
[...]
> > > > A given runtime platform would be compliant to some set of the
> > > > following conformance groups:
> > > >
> > > > 1. "basic": all instructions not covered by another group below.
> > > > 2. "atomic": all Atomic operations. I think Christoph argued for
> > > > this one in the meeting.
> > > > 3. "divide": all division and modulo operations. Alexei said in
> > > > the meeting that he'd heard demand for this one.
> > > > 4. "legacy": all legacy packet access instructions (deprecated).
> > > > 5. "map": 64-bit immediate instructions that deal with map fds or
> > > > map indices.
> > > > 6. "code": 64-bit immediate instruction that has a "code pointer" type.
> > > > 7. "func": program-local functions.
> > >
> > > I thought for a while about whether this should be part of the basic
> > > conformance group, and talked through it with Jakub Kicinski. I do
> > > think it makes sense to keep it separate like this. For e.g. devices
> > > with Harvard architectures, it could get quite non-trivial for the
> > > verifier to determine whether accesses to arguments stored in
> > > special registers are safe. Definitely not impossible, and overall
> > > very useful to support this, but in order to ease vendor adoption
> > > it's probably best to keep this separate.
> > >
> > > > Things that I *think* don't need a separate conformance group (can
> > > > just be in "basic") include:
> > > > a. Call helper function by address or BTF ID. A runtime that
> > > > doesn't support these simply won't expose any
> > > > such helper functions to BPF programs.
> > > > b. Platform variable instructions (dst = var_addr(imm)). A
> > > > runtime that doesn't support this simply won't
> > > > expose any platform variables to BPF programs.
> > > >
> > > > Comments? (Let the bikeshedding begin...)
> > >
> > > This list seems logical to me,
> >
> > I think we should do just two categories: legacy and the rest, since
> > any scheme will be flawed and infinite bikeshedding will ensue.
>
> If we do this, then aren't we forcing every vendor that adds BPF support to
> support every single instruction if they want to be compliant?
Right, I do think it could hinder BPF adoption if we required every hardware
offload card that wants to add BPF support to implement every single instruction.
> > For example, let's take a look at #2 atomic...
> > Should it include or exclude atomic_add insn ? It was added at the
> > very beginning of BPF ISA and was used from day one.
> > Without it it's impossible to count stats. The typical network or
> > tracing use case needs to count events and one cannot do it without
> > atomic increment. Eventually per-cpu maps were added as an alternative.
> > I suspect any platform that supports #1 basic insn without atomic_add
> > will not be practically useful.
> > Should atomic_add be a part of "basic" then? But it's atomic.
> > Then what about atomic_fetch_add insn? It's pretty close semantically.
> > Part of atomic or part of basic?
>
> I think it's reasonable to expect that if you require an atomic add, that you
> may also require the other atomic instructions as well and that it would be
> logical to group them together, yes. I believe that Netronome supports all of
> the atomic instructions, as one example. If you're providing a BPF runtime in
> an environment where atomic adds are required, I think it stands to reason
> that you should probably support the other atomics as well, no?
I agree.
> > Another example, #3 divide. bpf cpu=v1 ISA only has unsigned div/mod.
> > Eventually we added a signed version. Integer division is one of the
> > slowest operations in a HW. Different cpus have different flavors of
> > them 64/32 64/64 32/32, etc. All with different quirks.
> > cpu=v1 had modulo insn because in tracing one often needs to do it to
> > select a slot in a table, but in networking there is rarely a need.
> > So bpf offload into netronome HW doesn't support it (iirc).
>
> Correct, my understanding is that BPF offload in netronome supports neither
> division nor modulo.
In my opinion, this is a valid technical reason to put them into a separate
conformance group: it allows hardware offload cards to support BPF without
requiring division/modulo, which they might not have the silicon space or
other budget for.
> > Should div/mod signed/unsigned be a part of basic? or separate?
> > Only 32 or 64 bit?
> >
> > Hence my point: legacy and the rest (as of cpu=v4) are the only two
> > categories we should have in _this_ version of the standard.
> > Rest assured we will add new insn in the coming months.
> > I suggest we figure out conformance groups for future insns at that time.
> > That would be the time to argue and actually extract value out of
> discussion.
> > Retroactive bike shedding is a bike shedding and nothing else.
>
> I wouldn't personally categorize this as retroactive _bikeshedding_.
> _Bikeshedding_ typically refers to spending a disproportionate amount of
> time on trivial matters. The discussion here isn't about a trivial matter,
> in my view; it's about encouraging adoption of BPF in standardized ways
> that can be reasoned about and strike a reasonable balance between
> platform complexity/cost and user-perceivable complexity.
> What we're trying to do is retroactive _grouping_, and I think what you're
> really arguing for is that grouping in general isn't necessary.
> So I think we should maybe take a step back and talk about what value it
> brings at a higher level to determine if the complexity / ambiguity of grouping
> is worth the benefit.
>
> From my perspective, the reason that we want conformance groups is purely
> for compliance and cross compatibility. If someone has a BPF program that
> does some class of operations, then conformance groups inform them about
> whether their prog will be able to run on some vendor implementation of
> BPF. For example, if you're doing network packet filtering and doing some
> atomics, hashing, etc on a Netronome NIC, you'd like for it to be able to run
> on other NICs that implement offload as well. If a NIC isn't compliant with the
> atomics group, it won't be able to support your prog.
>
> If we don't have conformance groups, I don't see how we can provide that
> guarantee. I think there's essentially a 0% chance that vendors will implement
> every instruction; nor should they have to. So if we just do "legacy" and
> "other", the grouping won't really tell a vendor or BPF developer anything
> other than what instructions are completely useless and should be avoided.
+1 to all of the above.
> If we want to get rid of conformance groups that's fine and I do think there's
> an argument for it, but I think we need to discuss this in terms of compliance
> and not the grouping aspect.
>
> FWIW, my perspective is that we should be aiming to enable compliance.
> I don't see any reason why a BPF prog that's offloaded to a NIC to do packet
> filtering shouldn't be able to e.g. run on multiple devices.
> That certainly won't be the case for every type of BPF program, but classifying
> groups of instructions does seem prudent.
I agree with David's assessment above.
Dave
--
Bpf mailing list
Bpf@ietf.org
https://www.ietf.org/mailman/listinfo/bpf