From: David Vernet <void@manifault.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Dave Thaler <dthaler1968@googlemail.com>,
bpf@ietf.org, bpf <bpf@vger.kernel.org>,
Jakub Kicinski <kuba@kernel.org>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [Bpf] BPF ISA conformance groups
Date: Wed, 13 Dec 2023 12:56:03 -0600 [thread overview]
Message-ID: <20231213185603.GA1968@maniforge> (raw)
In-Reply-To: <CAADnVQJ-JwNTY5fW-oXdTur9aDrv2NQoreTH3yYZemVBVtq9fQ@mail.gmail.com>
On Tue, Dec 12, 2023 at 05:32:33PM -0800, Alexei Starovoitov wrote:
> On Tue, Dec 12, 2023 at 3:36 PM David Vernet <void@manifault.com> wrote:
> >
> > > It only supports atomic_add and no other atomics.
> >
> > Ahh, I misunderstood when I discussed with Kuba. I guess they supported
> > only atomic_add because packets can be delivered out of order.
>
> Not sure why it has anything to do with packets.
My understanding is that out-of-order packet delivery is an impedance
mismatch with the host's ordering model. If you offload a BPF program
from the host that expects to see packets in order, and that issues
atomics as it processes them in order, it won't work on the device
because the packets are delivered out of order. Kuba (cc'd) can give
more details if he wants, but it doesn't really matter here. The salient
point is that the chip could have supported all of the BPF atomic
instructions, and it wouldn't have been much more work to implement them.
> > So fair
> > enough on that point, but I still stand by the claim though that if you
> > need one type of atomic, it's reasonable to infer that you may need all
> > of them. I would be curious to hear how much work it would have been to
> > add support for the others. If there was an atomic conformance group,
> > maybe they would have.
>
> The netronome wasn't trying to offload this or that insn to be
> in compliance. Together, netronome and bpf folks decided to focus
> on a set of real XDP applications and try to offload as much as practical.
> At that time there were -mcpu=v1 and v2 insn sets only and offloading
> wasn't really working well. alu32 in llvm, verifier and nfp was added
> to make offload practical. Eventually it became -mcpu=v3.
> So compliance with any future group (basic, atomic, etc) in ISA cannot
> be evaluated in isolation, because nfp is not compliant with -mcpu=v4
> and not compliant with -mcpu=v1,
> but works well with -mcpu=v3 while v3 is an extension of v1 and v2.
> Which is nonsensical from standard compliance pov.
> netronome offload is a success because it demonstrated
> how real production XDP applications can run in a NIC at speeds
> that traditional CPUs cannot dream of.
> It's a success despite the complexity and ugliness of BPF ISA.
> It's working because practical applications compiled with -mcpu=v3 produce
> "compliant enough" bpf code.
Something I want to make sure is clearly spelled out: are you of the
opinion that a program written for offload to a Netronome device cannot
and should not ever be able to run on any other NIC with BPF offload?
> > Well, maybe not for Netronome, or maybe not even for any vendor (though
> > we have no way of knowing that yet), but what about for other contexts
> > like Windows / Linux cross-platform compat?
>
> bpf on windows started similar to netronome. The goal was to
> demonstrate real cilium progs running on windows. And it was done.
> Since windows is a software there was no need to add or remove anything
> from ISA, but due to licensing the prevail verifier had to be used which
> doesn't support a whole bunch of things.
> This software deficiencies of non-linux verifier shouldn't be
> dictating grouping of the insns in the standard.
>
> If linux can do it, windows should be able to do it just as well.
> So I see no problem saying that bpf on windows will be non-compliant
> until they support all of -mcpu=v4 insns. It's a software project
> with a deterministic timeline.
>
> The standard should focus on compatibility between
> HW-ish offloads where no amount of software can add support for
> all of -mcpu=v4.
I don't agree that there's no value in standardizing for the sake of
software as well, but yes, it's different from what we're trying to
accomplish for hardware, and I agree that hardware is the main customer
here.
Even if you assume that we should completely ignore software and focus
on hardware compatibility, though, that seems orthogonal to what you're
proposing here. What compatibility are we guaranteeing if there's no
compliance?
> And here I believe compliance with "basic" is not practical.
> When nvme HW architects will get to implement "basic" ISA they might
> realize that it has too much.
> Producing "conformance groups" without HW folks thinking through the
> implementation is not going to be a success.
> I worry that it will have the opposite effect.
> We'll have a standard with basic, atomic, etc.
> Then folks will deliver this standard on the desk of HW architects.
> They will give it a try and will reject the idea of implementing BPF in HW,
> because not implementing "basic" would mean that this vendor
> is not in compliance which means no business.
I don't know enough about how compliance informs the cost calculus and
decision making of HW vendors to make a truly informed point here, but I
have to imagine there's an equally plausible scenario where a vendor
looks at the non-legacy instructions, thinks, "There's no possible way
we could support all of these instructions", and makes the same
decision. Why else would they be asking for a standard if not to have
some guidelines on what to implement?
> Hence the standard shouldn't overfocus on compliance and groups.
> Just legacy and the rest will do for nvme.
> legacy means "don't bother looking at those".
> the rest means "pls implement these insns because they are useful,
> their semantics and encoding is standardized,
> but pick what makes sense for your use case and your HW".
How do we know the semantics of the instructions won't be prohibitively
expensive or impractical for certain vendors? What value do we get out
of dictating semantics in the standard if we don't expect any of these
programs to be cross-compatible anyway?
> And to make such HW offload a success we'd need to work together.
> compiler, kernel, run-time, hw folks.
Which kernel? Which compiler? If we all need to be in the room every
time a decision is made by any vendor, then what value is the standard
even providing?
> "Here is a standard. Go implement it" won't work.
What is the point of a standard if not to say, "Here's what you should
go implement"?
Thread overview: 56+ messages
2023-11-27 20:18 [Bpf] IETF 118 BPF WG summary David Vernet
2023-11-28  9:43 ` Michael Richardson
2023-12-02 19:51 ` BPF ISA conformance groups dthaler1968
2023-12-07 21:51 ` David Vernet
2023-12-10  3:10 ` Alexei Starovoitov
2023-12-10 21:13 ` Watson Ladd
2023-12-12 21:45 ` David Vernet
2023-12-12 22:01 ` dthaler1968
2023-12-12 22:55 ` Alexei Starovoitov
2023-12-12 23:35 ` David Vernet
2023-12-13  1:32 ` Alexei Starovoitov
2023-12-13 18:56 ` David Vernet [this message]
2023-12-14  0:12 ` Alexei Starovoitov
2023-12-14 17:44 ` David Vernet
2023-12-15  5:29 ` Christoph Hellwig
2023-12-19  1:15 ` Alexei Starovoitov
2023-12-19 18:10 ` dthaler1968
2023-12-20  3:28 ` Alexei Starovoitov
2023-12-21  7:00 ` Christoph Hellwig
2024-01-05 22:07 ` David Vernet
2024-01-08 16:00 ` Christoph Hellwig
2024-01-08 21:51 ` Alexei Starovoitov
2024-01-09 11:35 ` Jose E. Marchesi
2024-01-23 21:39 ` David Vernet
2024-01-23 23:29 ` dthaler1968
2024-01-25  2:55 ` Alexei Starovoitov
2024-01-09 15:26 ` Christoph Hellwig
2023-12-19 18:15 ` dthaler1968
2023-12-13 16:59 ` Christoph Hellwig