From: dthaler1968@googlemail.com
To: "'David Vernet'" <void@manifault.com>
Cc: <bpf@ietf.org>, "'bpf'" <bpf@vger.kernel.org>
Subject: RE: [Bpf] BPF ISA conformance groups
Date: Tue, 12 Dec 2023 14:01:09 -0800 [thread overview]
Message-ID: <157b01da2d46$b7453e20$25cfba60$@gmail.com> (raw)
In-Reply-To: <20231212214532.GB1222@maniforge>
David Vernet <void@manifault.com> wrote:
[...]
> > > > A given runtime platform would be compliant to some set of the
> > > > following conformance groups:
> > > >
> > > > 1. "basic": all instructions not covered by another group below.
> > > > 2. "atomic": all Atomic operations. I think Christoph argued for
> > > > this one in the meeting.
> > > > 3. "divide": all division and modulo operations. Alexei said in
> > > > the meeting that he'd heard demand for this one.
> > > > 4. "legacy": all legacy packet access instructions (deprecated).
> > > > 5. "map": 64-bit immediate instructions that deal with map fds or
> > > > map indices.
> > > > 6. "code": 64-bit immediate instruction that has a "code pointer" type.
> > > > 7. "func": program-local functions.
> > >
> > > I thought for a while about whether this should be part of the basic
> > > conformance group, and talked through it with Jakub Kicinski. I do
> > > think it makes sense to keep it separate like this. For e.g. devices
> > > with Harvard architectures, it could get quite non-trivial for the
> > > verifier to determine whether accesses to arguments stored in
> > > special register are safe. Definitely not impossible, and overall
> > > very useful to support this, but in order to ease vendor adoption
> > > it's probably best to keep this separate.
> > >
> > > > Things that I *think* don't need a separate conformance group (can
> > > > just be in "basic") include:
> > > > a. Call helper function by address or BTF ID. A runtime that
> > > > doesn't support these simply won't expose any
> > > > such helper functions to BPF programs.
> > > > b. Platform variable instructions (dst = var_addr(imm)). A
> > > > runtime that doesn't support this simply won't
> > > > expose any platform variables to BPF programs.
> > > >
> > > > Comments? (Let the bikeshedding begin...)
> > >
> > > This list seems logical to me,
> >
> > I think we should do just two categories: legacy and the rest, since
> > any scheme will be flawed and infinite bikeshedding will ensue.
>
> If we do this, then aren't we forcing every vendor that adds BPF support to
> support every single instruction if they want to be compliant?
Right, indeed I think it could hinder BPF adoption if we required every
single instruction in any hardware offload card that wanted to add BPF support.
> > For example, let's take a look at #2 atomic...
> > Should it include or exclude atomic_add insn ? It was added at the
> > very beginning of BPF ISA and was used from day one.
> > Without it it's impossible to count stats. The typical network or
> > tracing use case needs to count events and one cannot do it without
> > atomic increment. Eventually per-cpu maps were added as an alternative.
> > I suspect any platform that supports #1 basic insn without atomic_add
> > will not be practically useful.
> > Should atomic_add be a part of "basic" then? But it's atomic.
> > Then what about atomic_fetch_add insn? It's pretty close semantically.
> > Part of atomic or part of basic?
>
> I think it's reasonable to expect that if you require an atomic add, that you
> may also require the other atomic instructions as well and that it would be
> logical to group them together, yes. I believe that Netronome supports all of
> the atomic instructions, as one example. If you're providing a BPF runtime in
> an environment where atomic adds are required, I think it stands to reason
> that you should probably support the other atomics as well, no?
I agree.
> > Another example, #3 divide. bpf cpu=v1 ISA only has unsigned div/mod.
> > Eventually we added a signed version. Integer division is one of the
> > slowest operations in a HW. Different cpus have different flavors of
> > them 64/32 64/64 32/32, etc. All with different quirks.
> > cpu=v1 had modulo insn because in tracing one often needs to do it to
> > select a slot in a table, but in networking there is rarely a need.
> > So bpf offload into netronome HW doesn't support it (iirc).
>
> Correct, my understanding is that BPF offload in netronome supports neither
> division nor modulo.
In my opinion, this is a valid technical reason to put them into a separate
conformance group, to allow hardware offload cards to support BPF without
requiring division/modulo which they might not have space or other budget for.
> > Should div/mod signed/unsigned be a part of basic? or separate?
> > Only 32 or 64 bit?
> >
> > Hence my point: legacy and the rest (as of cpu=v4) are the only two
> > categories we should have in _this_ version of the standard.
> > Rest assured we will add new insn in the coming months.
> > I suggest we figure out conformance groups for future insns at that time.
> > That would be the time to argue and actually extract value out of
> discussion.
> > Retroactive bike shedding is a bike shedding and nothing else.
>
> I wouldn't personally categorize this as retroactive _bikeshedding_.
_ bikeshedding_ typically refers to spending a disproportionate amount of
time on trivial matters. The discussion here isn't about a trivial matter,
in my view, it's about encouraging adoption of BPF in standardized ways
that can be reasoned about and strike a reasonable balance between
platform complexity/cost and user-perceivable complexity.
> What we're trying to do is retroactive _grouping_, and I think what you're
> really arguing for is that grouping in general isn't necessary.
> So I think we should maybe take a step back and talk about what value it
> brings at a higher level to determine if the complexity / ambiguity of grouping
> is worth the benefit.
>
> From my perspective, the reason that we want conformance groups is purely
> for compliance and cross compatibility. If someone has a BPF program that
> does some class of operations, then conformance groups inform them about
> whether their prog will be able to run on some vendor implementation of
> BPF. For example, if you're doing network packet filtering and doing some
> atomics, hashing, etc on a Netronome NIC, you'd like for it to be able to run
> on other NICs that implement offload as well. If a NIC isn't compliant with the
> atomics group, it won't be able to support your prog.
>
> If we don't have conformance groups, I don't see how we can provide that
> guarantee. I think there's essentially a 0% chance that vendors will implement
> every instruction; nor should they have to. So if we just do "legacy" and
> "other", the grouping won't really tell a vendor or BPF developer anything
> other than what instructions are completely useless and should be avoided.
+1 to all of above.
> If we want to get rid of conformance groups that's fine and I do think there's
> an argument for it, but I think we need to discuss this in terms of compliance
> and not the grouping aspect.
>
> FWIW, my perspective is that we should be aiming to enable compliance.
> I don't see any reason why a BPF prog that's offloaded to a NIC to do packet
> filtering shouldn't be able to e.g. run on multiple devices.
> That certainly won't be the case for every type of BPF program, but classifying
> groups of instructions does seem prudent.
I agree with David's assessment above.
Dave
WARNING: multiple messages have this Message-ID (diff)
From: dthaler1968=40googlemail.com@dmarc.ietf.org
To: "'David Vernet'" <void@manifault.com>
Cc: <bpf@ietf.org>, "'bpf'" <bpf@vger.kernel.org>
Subject: Re: [Bpf] BPF ISA conformance groups
Date: Tue, 12 Dec 2023 14:01:09 -0800 [thread overview]
Message-ID: <157b01da2d46$b7453e20$25cfba60$@gmail.com> (raw)
Message-ID: <20231212220109.wDnGe82PgcOjLjRObBvONNb0F84z-5IDvGh4N3424mE@z> (raw)
In-Reply-To: <20231212214532.GB1222@maniforge>
David Vernet <void@manifault.com> wrote:
[...]
> > > > A given runtime platform would be compliant to some set of the
> > > > following conformance groups:
> > > >
> > > > 1. "basic": all instructions not covered by another group below.
> > > > 2. "atomic": all Atomic operations. I think Christoph argued for
> > > > this one in the meeting.
> > > > 3. "divide": all division and modulo operations. Alexei said in
> > > > the meeting that he'd heard demand for this one.
> > > > 4. "legacy": all legacy packet access instructions (deprecated).
> > > > 5. "map": 64-bit immediate instructions that deal with map fds or
> > > > map indices.
> > > > 6. "code": 64-bit immediate instruction that has a "code pointer" type.
> > > > 7. "func": program-local functions.
> > >
> > > I thought for a while about whether this should be part of the basic
> > > conformance group, and talked through it with Jakub Kicinski. I do
> > > think it makes sense to keep it separate like this. For e.g. devices
> > > with Harvard architectures, it could get quite non-trivial for the
> > > verifier to determine whether accesses to arguments stored in
> > > special register are safe. Definitely not impossible, and overall
> > > very useful to support this, but in order to ease vendor adoption
> > > it's probably best to keep this separate.
> > >
> > > > Things that I *think* don't need a separate conformance group (can
> > > > just be in "basic") include:
> > > > a. Call helper function by address or BTF ID. A runtime that
> > > > doesn't support these simply won't expose any
> > > > such helper functions to BPF programs.
> > > > b. Platform variable instructions (dst = var_addr(imm)). A
> > > > runtime that doesn't support this simply won't
> > > > expose any platform variables to BPF programs.
> > > >
> > > > Comments? (Let the bikeshedding begin...)
> > >
> > > This list seems logical to me,
> >
> > I think we should do just two categories: legacy and the rest, since
> > any scheme will be flawed and infinite bikeshedding will ensue.
>
> If we do this, then aren't we forcing every vendor that adds BPF support to
> support every single instruction if they want to be compliant?
Right, indeed I think it could hinder BPF adoption if we required every
single instruction in any hardware offload card that wanted to add BPF support.
> > For example, let's take a look at #2 atomic...
> > Should it include or exclude atomic_add insn ? It was added at the
> > very beginning of BPF ISA and was used from day one.
> > Without it it's impossible to count stats. The typical network or
> > tracing use case needs to count events and one cannot do it without
> > atomic increment. Eventually per-cpu maps were added as an alternative.
> > I suspect any platform that supports #1 basic insn without atomic_add
> > will not be practically useful.
> > Should atomic_add be a part of "basic" then? But it's atomic.
> > Then what about atomic_fetch_add insn? It's pretty close semantically.
> > Part of atomic or part of basic?
>
> I think it's reasonable to expect that if you require an atomic add, that you
> may also require the other atomic instructions as well and that it would be
> logical to group them together, yes. I believe that Netronome supports all of
> the atomic instructions, as one example. If you're providing a BPF runtime in
> an environment where atomic adds are required, I think it stands to reason
> that you should probably support the other atomics as well, no?
I agree.
> > Another example, #3 divide. bpf cpu=v1 ISA only has unsigned div/mod.
> > Eventually we added a signed version. Integer division is one of the
> > slowest operations in a HW. Different cpus have different flavors of
> > them 64/32 64/64 32/32, etc. All with different quirks.
> > cpu=v1 had modulo insn because in tracing one often needs to do it to
> > select a slot in a table, but in networking there is rarely a need.
> > So bpf offload into netronome HW doesn't support it (iirc).
>
> Correct, my understanding is that BPF offload in netronome supports neither
> division nor modulo.
In my opinion, this is a valid technical reason to put them into a separate
conformance group, to allow hardware offload cards to support BPF without
requiring division/modulo which they might not have space or other budget for.
> > Should div/mod signed/unsigned be a part of basic? or separate?
> > Only 32 or 64 bit?
> >
> > Hence my point: legacy and the rest (as of cpu=v4) are the only two
> > categories we should have in _this_ version of the standard.
> > Rest assured we will add new insn in the coming months.
> > I suggest we figure out conformance groups for future insns at that time.
> > That would be the time to argue and actually extract value out of
> discussion.
> > Retroactive bike shedding is a bike shedding and nothing else.
>
> I wouldn't personally categorize this as retroactive _bikeshedding_.
_ bikeshedding_ typically refers to spending a disproportionate amount of
time on trivial matters. The discussion here isn't about a trivial matter,
in my view, it's about encouraging adoption of BPF in standardized ways
that can be reasoned about and strike a reasonable balance between
platform complexity/cost and user-perceivable complexity.
> What we're trying to do is retroactive _grouping_, and I think what you're
> really arguing for is that grouping in general isn't necessary.
> So I think we should maybe take a step back and talk about what value it
> brings at a higher level to determine if the complexity / ambiguity of grouping
> is worth the benefit.
>
> From my perspective, the reason that we want conformance groups is purely
> for compliance and cross compatibility. If someone has a BPF program that
> does some class of operations, then conformance groups inform them about
> whether their prog will be able to run on some vendor implementation of
> BPF. For example, if you're doing network packet filtering and doing some
> atomics, hashing, etc on a Netronome NIC, you'd like for it to be able to run
> on other NICs that implement offload as well. If a NIC isn't compliant with the
> atomics group, it won't be able to support your prog.
>
> If we don't have conformance groups, I don't see how we can provide that
> guarantee. I think there's essentially a 0% chance that vendors will implement
> every instruction; nor should they have to. So if we just do "legacy" and
> "other", the grouping won't really tell a vendor or BPF developer anything
> other than what instructions are completely useless and should be avoided.
+1 to all of above.
> If we want to get rid of conformance groups that's fine and I do think there's
> an argument for it, but I think we need to discuss this in terms of compliance
> and not the grouping aspect.
>
> FWIW, my perspective is that we should be aiming to enable compliance.
> I don't see any reason why a BPF prog that's offloaded to a NIC to do packet
> filtering shouldn't be able to e.g. run on multiple devices.
> That certainly won't be the case for every type of BPF program, but classifying
> groups of instructions does seem prudent.
I agree with David's assessment above.
Dave
--
Bpf mailing list
Bpf@ietf.org
https://www.ietf.org/mailman/listinfo/bpf
next prev parent reply other threads:[~2023-12-12 22:01 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-27 20:18 IETF 118 BPF WG summary David Vernet
2023-11-27 20:18 ` [Bpf] " David Vernet
2023-11-28 9:43 ` Michael Richardson
2023-11-28 9:43 ` Michael Richardson
2023-12-02 19:51 ` BPF ISA conformance groups dthaler1968
2023-12-02 19:51 ` [Bpf] " dthaler1968=40googlemail.com
2023-12-07 21:51 ` David Vernet
2023-12-07 21:51 ` David Vernet
2023-12-10 3:10 ` Alexei Starovoitov
2023-12-10 3:10 ` Alexei Starovoitov
2023-12-10 21:13 ` Watson Ladd
2023-12-10 21:13 ` Watson Ladd
2023-12-12 21:45 ` David Vernet
2023-12-12 21:45 ` David Vernet
2023-12-12 22:01 ` dthaler1968 [this message]
2023-12-12 22:01 ` dthaler1968=40googlemail.com
2023-12-12 22:55 ` Alexei Starovoitov
2023-12-12 22:55 ` Alexei Starovoitov
2023-12-12 23:35 ` David Vernet
2023-12-12 23:35 ` David Vernet
2023-12-13 1:32 ` Alexei Starovoitov
2023-12-13 1:32 ` Alexei Starovoitov
2023-12-13 18:56 ` David Vernet
2023-12-13 18:56 ` David Vernet
2023-12-14 0:12 ` Alexei Starovoitov
2023-12-14 0:12 ` Alexei Starovoitov
2023-12-14 17:44 ` David Vernet
2023-12-14 17:44 ` David Vernet
2023-12-15 5:29 ` Christoph Hellwig
2023-12-15 5:29 ` Christoph Hellwig
2023-12-19 1:15 ` Alexei Starovoitov
2023-12-19 1:15 ` Alexei Starovoitov
2023-12-19 18:10 ` dthaler1968
2023-12-19 18:10 ` dthaler1968=40googlemail.com
2023-12-20 3:28 ` Alexei Starovoitov
2023-12-20 3:28 ` Alexei Starovoitov
2023-12-21 7:00 ` Christoph Hellwig
2023-12-21 7:00 ` Christoph Hellwig
2024-01-05 22:07 ` David Vernet
2024-01-05 22:07 ` David Vernet
2024-01-08 16:00 ` Christoph Hellwig
2024-01-08 21:51 ` Alexei Starovoitov
2024-01-08 21:51 ` Alexei Starovoitov
2024-01-09 11:35 ` Jose E. Marchesi
2024-01-09 11:35 ` Jose E. Marchesi
2024-01-23 21:39 ` David Vernet
2024-01-23 21:39 ` David Vernet
2024-01-23 23:29 ` dthaler1968
2024-01-23 23:29 ` dthaler1968=40googlemail.com
2024-01-25 2:55 ` Alexei Starovoitov
2024-01-25 2:55 ` Alexei Starovoitov
2024-01-09 15:26 ` Christoph Hellwig
2023-12-19 18:15 ` dthaler1968
2023-12-19 18:15 ` dthaler1968=40googlemail.com
2023-12-13 16:59 ` Christoph Hellwig
2023-12-13 16:59 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='157b01da2d46$b7453e20$25cfba60$@gmail.com' \
--to=dthaler1968@googlemail.com \
--cc=bpf@ietf.org \
--cc=bpf@vger.kernel.org \
--cc=void@manifault.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.