From: Jiri Pirko <jiri@resnulli.us>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
netdev@vger.kernel.org, deb.chatterjee@intel.com,
anjali.singhai@intel.com, Vipin.Jain@amd.com,
namrata.limaye@intel.com, tom@sipanda.io, mleitner@redhat.com,
Mahesh.Shirshyad@amd.com, tomasz.osinski@intel.com,
xiyou.wangcong@gmail.com, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
vladbu@nvidia.com, horms@kernel.org, daniel@iogearbox.net,
bpf@vger.kernel.org, khalidm@nvidia.com, toke@redhat.com,
mattyk@nvidia.com, dan.daly@intel.com,
chris.sommers@keysight.com, john.andy.fingerhut@intel.com
Subject: Re: [PATCH net-next v8 00/15] Introducing P4TC
Date: Mon, 20 Nov 2023 19:10:14 +0100 [thread overview]
Message-ID: <ZVuhBlYRwi8eGiSF@nanopsycho> (raw)
In-Reply-To: <CAM0EoMm3whh6xaAdKcT=a9FcSE4EMn=xJxkXY5ked=nwGaGFeQ@mail.gmail.com>
Mon, Nov 20, 2023 at 03:23:59PM CET, jhs@mojatatu.com wrote:
>On Mon, Nov 20, 2023 at 4:39 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Fri, Nov 17, 2023 at 09:46:11PM CET, jhs@mojatatu.com wrote:
>> >On Fri, Nov 17, 2023 at 1:37 PM John Fastabend <john.fastabend@gmail.com> wrote:
>> >>
>> >> Jamal Hadi Salim wrote:
>> >> > On Fri, Nov 17, 2023 at 1:27 AM John Fastabend <john.fastabend@gmail.com> wrote:
>> >> > >
>> >> > > Jamal Hadi Salim wrote:
>>
>> [...]
>>
>>
>> >>
>> >> I think I'm judging the technical work here. Bullet points.
>> >>
>> >> 1. p4c-tc implementation looks like it should be slower than a
>> >> in terms of pkts/sec than a bpf implementation. Meaning
>> >> I suspect pipeline and objects laid out like this will lose
>> >> to a BPF program with an parser and single lookup. The p4c-ebpf
>> >> compiler should look to create optimized EBPF code not some
>> >> emulated switch topology.
>> >>
>> >
>> >The parser is ebpf based. The other objects which require control
>> >plane interaction are not - those interact via netlink.
>> >We published perf data a while back - presented at the P4 workshop
>> >back in April (was in the cover letter)
>> >https://github.com/p4tc-dev/docs/blob/main/p4-conference-2023/2023P4WorkshopP4TC.pdf
>> >But do note: the correct abstraction is the first priority.
>> >Optimization is something we can teach the compiler over time. But
>> >even with the minimalist code generation you can see that our approach
>> >always beats ebpf in LPM and ternary. The other ones I am pretty sure
>>
>> Any idea why? Perhaps the existing eBPF maps are not that suitable for
>> this kinds of lookups? I mean in theory, eBPF should be always faster.
>
>We didnt look closely; however, that is not the point - the point is
>the perf difference if there is one, is not big with the big win being
>proper P4 abstraction. For LPM for sure our algorithmic approach is
>better. For ternary the compute intensity in looping is better done in
>C. And for exact i believe that ebpf uses better hashing.
>Again, that is not the point we were trying to validate in those experiments..
>
>On your point of "maps are not that suitable" P4 tables tend to have
>very specific attributes (examples associated meters, counters,
>default hit and miss actions, etc).
>
>> >we can optimize over time.
>> >Your view of "single lookup" is true for simple programs but if you
>> >have 10 tables trying to model a 5G function then it doesnt make sense
>> >(and i think the data we published was clear that you gain no
>> >advantage using ebpf - as a matter of fact there was no perf
>> >difference between XDP and tc in such cases).
>> >
>> >> 2. p4c-tc control plan looks slower than a directly mmaped bpf
>> >> map. Doing a simple update vs a netlink msg. The argument
>> >> that BPF can't do CRUD (which we had offlist) seems incorrect
>> >> to me. Correct me if I'm wrong with details about why.
>> >>
>> >
>> >So let me see....
>> >you want me to replace netlink and all its features and rewrite it
>> >using the ebpf system calls? Congestion control, event handling,
>> >arbitrary message crafting, etc and the years of work that went into
>> >netlink? NO to the HELL.
>>
>> Wait, I don't think John suggests anything like that. He just suggests
>> to have the tables as eBPF maps.
>
>What's the difference? Unless maps can do netlink.
>
>> Honestly, I don't understand the
>> fixation on netlink. Its socket messaging, memcpies, processing
>> overhead, etc can't keep up with mmaped memory access at scale. Measure
>> that and I bet you'll get drastically different results.
>>
>> I mean, netlink is good for a lot of things, but does not mean it is an
>> universal answer to userspace<->kernel data passing.
>
>Here's a small sample of our requirements that are satisfied by
>netlink for P4 object hierarchy[1]:
>1. Msg construction/parsing
>2. Multi-user request/response messaging
What is actually a usecase for having multiple users program p4 pipeline
in parallel?
>3. Multi-user event subscribe/publish messaging
Same here. What is the usecase for multiple users receiving p4 events?
>
>I dont think i need to provide an explanation on the differences here
>visavis what ebpf system calls provide vs what netlink provides and
>how netlink is a clear fit. If it is not clear i can give more
It is not :/
>breakdown. And of course there's more but above is a good sample.
>
>The part that is taken for granted is the control plane code and
>interaction which is an extremely important detail. P4 Abstraction
>requires hierarchies with different compiler generated encoded path
>ids etc. This ID mapping gets exacerbated by having multitudes of P4
Why the actual eBFP mapping does not serve the same purpose as ID?
ID:mapping
1 :1
?
>programs which have different requirements. Netlink is a natural fit
>for this P4 abstraction. Not to mention the netlink/tc path (and in
>particular the ID mapping) provides a conduit for offload when that is
>needed.
>eBPF is just a tool - and the objects are intended to be generic - and
>i dont see how any of this could be achieved without retooling to make
>it more specific to P4.
>
>cheers,
>jamal
>
>
>
>>
>> >I should note: that there was an interesting talk at netdevconf 0x17
>> >where the speaker showed the challenges of dealing with ebpf on "day
>> >two" - slides or videos are not up yet, but link is:
>> >https://netdevconf.info/0x17/sessions/talk/is-scaling-ebpf-easy-yet-a-small-step-to-one-server-but-giant-leap-to-distributed-network.html
>> >The point the speaker was making is it's always easy to whip an ebpf
>> >program that can slice and dice packets and maybe even flush LEDs but
>> >the real work and challenge is in the control plane. I agree with the
>> >speaker based on my experiences. This discussion of replacing netlink
>> >with ebpf system calls is absolutely a non-starter. Let's just end the
>> >discussion and agree to disagree if you are going to keep insisting on
>> >that.
>>
>>
>> [...]
next prev parent reply other threads:[~2023-11-20 18:10 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-16 14:59 [PATCH net-next v8 00/15] Introducing P4TC Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 01/15] net: sched: act_api: Introduce dynamic actions list Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 02/15] net/sched: act_api: increase action kind string length Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 03/15] net/sched: act_api: Update tc_action_ops to account for dynamic actions Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 04/15] net/sched: act_api: add struct p4tc_action_ops as a parameter to lookup callback Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 05/15] net: sched: act_api: Add support for preallocated dynamic action instances Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 06/15] net: introduce rcu_replace_pointer_rtnl Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 07/15] rtnl: add helper to check if group has listeners Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 08/15] p4tc: add P4 data types Jamal Hadi Salim
2023-11-16 16:03 ` Jiri Pirko
2023-11-17 12:01 ` Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 09/15] p4tc: add template pipeline create, get, update, delete Jamal Hadi Salim
2023-11-16 16:11 ` Jiri Pirko
2023-11-17 12:09 ` Jamal Hadi Salim
2023-11-20 8:18 ` Jiri Pirko
2023-11-20 12:48 ` Jamal Hadi Salim
2023-11-20 13:16 ` Jiri Pirko
2023-11-20 15:30 ` Jamal Hadi Salim
2023-11-20 16:25 ` Jiri Pirko
2023-11-20 18:20 ` David Ahern
2023-11-20 20:12 ` Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 10/15] p4tc: add action template create, update, delete, get, flush and dump Jamal Hadi Salim
2023-11-16 16:28 ` Jiri Pirko
2023-11-17 15:11 ` Jamal Hadi Salim
2023-11-20 8:19 ` Jiri Pirko
2023-11-20 13:45 ` Jamal Hadi Salim
2023-11-20 16:25 ` Jiri Pirko
2023-11-17 6:51 ` John Fastabend
2023-11-16 14:59 ` [PATCH net-next v8 11/15] p4tc: add template table " Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 12/15] p4tc: add runtime table entry create, update, get, delete, " Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 13/15] p4tc: add set of P4TC table kfuncs Jamal Hadi Salim
2023-11-17 7:09 ` John Fastabend
2023-11-19 9:14 ` kernel test robot
2023-11-20 22:28 ` kernel test robot
2023-11-16 14:59 ` [PATCH net-next v8 14/15] p4tc: add P4 classifier Jamal Hadi Salim
2023-11-17 7:17 ` John Fastabend
2023-11-16 14:59 ` [PATCH net-next v8 15/15] p4tc: Add P4 extern interface Jamal Hadi Salim
2023-11-16 16:42 ` Jiri Pirko
2023-11-17 12:14 ` Jamal Hadi Salim
2023-11-20 8:22 ` Jiri Pirko
2023-11-20 14:02 ` Jamal Hadi Salim
2023-11-20 16:27 ` Jiri Pirko
2023-11-20 19:00 ` Jamal Hadi Salim
2023-11-17 6:27 ` [PATCH net-next v8 00/15] Introducing P4TC John Fastabend
2023-11-17 12:49 ` Jamal Hadi Salim
2023-11-17 18:37 ` John Fastabend
2023-11-17 20:46 ` Jamal Hadi Salim
2023-11-20 9:39 ` Jiri Pirko
2023-11-20 14:23 ` Jamal Hadi Salim
2023-11-20 18:10 ` Jiri Pirko [this message]
2023-11-20 19:56 ` Jamal Hadi Salim
2023-11-20 20:41 ` John Fastabend
2023-11-20 22:13 ` Jamal Hadi Salim
2023-11-20 21:48 ` Daniel Borkmann
2023-11-20 22:56 ` Jamal Hadi Salim
2023-11-21 13:06 ` Jiri Pirko
2023-11-21 13:47 ` Jamal Hadi Salim
2023-11-21 14:19 ` Jiri Pirko
2023-11-21 15:21 ` Jamal Hadi Salim
2023-11-22 9:25 ` Jiri Pirko
2023-11-22 15:14 ` Jamal Hadi Salim
2023-11-22 18:31 ` Jiri Pirko
2023-11-22 18:50 ` John Fastabend
2023-11-22 19:35 ` Jamal Hadi Salim
2023-11-23 6:36 ` Jiri Pirko
2023-11-23 13:22 ` Jamal Hadi Salim
2023-11-23 13:34 ` Jiri Pirko
2023-11-23 13:45 ` Jamal Hadi Salim
2023-11-23 14:07 ` Jiri Pirko
2023-11-23 14:28 ` Jamal Hadi Salim
2023-11-23 15:27 ` Jiri Pirko
2023-11-23 16:30 ` Jamal Hadi Salim
2023-11-23 17:53 ` Edward Cree
2023-11-23 18:09 ` Jiri Pirko
2023-11-23 18:58 ` Jamal Hadi Salim
2023-11-23 18:53 ` Jakub Kicinski
2023-11-23 19:42 ` Tom Herbert
2023-11-24 10:39 ` Jiri Pirko
2023-11-23 18:04 ` Jiri Pirko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZVuhBlYRwi8eGiSF@nanopsycho \
--to=jiri@resnulli.us \
--cc=Mahesh.Shirshyad@amd.com \
--cc=Vipin.Jain@amd.com \
--cc=anjali.singhai@intel.com \
--cc=bpf@vger.kernel.org \
--cc=chris.sommers@keysight.com \
--cc=dan.daly@intel.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=deb.chatterjee@intel.com \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jhs@mojatatu.com \
--cc=john.andy.fingerhut@intel.com \
--cc=john.fastabend@gmail.com \
--cc=khalidm@nvidia.com \
--cc=kuba@kernel.org \
--cc=mattyk@nvidia.com \
--cc=mleitner@redhat.com \
--cc=namrata.limaye@intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=toke@redhat.com \
--cc=tom@sipanda.io \
--cc=tomasz.osinski@intel.com \
--cc=vladbu@nvidia.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.