public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Jiri Pirko <jiri@resnulli.us>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
	netdev@vger.kernel.org, deb.chatterjee@intel.com,
	anjali.singhai@intel.com, Vipin.Jain@amd.com,
	namrata.limaye@intel.com, tom@sipanda.io, mleitner@redhat.com,
	Mahesh.Shirshyad@amd.com, tomasz.osinski@intel.com,
	xiyou.wangcong@gmail.com, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vladbu@nvidia.com, horms@kernel.org, daniel@iogearbox.net,
	bpf@vger.kernel.org, khalidm@nvidia.com, toke@redhat.com,
	mattyk@nvidia.com, dan.daly@intel.com,
	chris.sommers@keysight.com, john.andy.fingerhut@intel.com
Subject: Re: [PATCH net-next v8 00/15] Introducing P4TC
Date: Mon, 20 Nov 2023 19:10:14 +0100	[thread overview]
Message-ID: <ZVuhBlYRwi8eGiSF@nanopsycho> (raw)
In-Reply-To: <CAM0EoMm3whh6xaAdKcT=a9FcSE4EMn=xJxkXY5ked=nwGaGFeQ@mail.gmail.com>

Mon, Nov 20, 2023 at 03:23:59PM CET, jhs@mojatatu.com wrote:
>On Mon, Nov 20, 2023 at 4:39 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Fri, Nov 17, 2023 at 09:46:11PM CET, jhs@mojatatu.com wrote:
>> >On Fri, Nov 17, 2023 at 1:37 PM John Fastabend <john.fastabend@gmail.com> wrote:
>> >>
>> >> Jamal Hadi Salim wrote:
>> >> > On Fri, Nov 17, 2023 at 1:27 AM John Fastabend <john.fastabend@gmail.com> wrote:
>> >> > >
>> >> > > Jamal Hadi Salim wrote:
>>
>> [...]
>>
>>
>> >>
>> >> I think I'm judging the technical work here. Bullet points.
>> >>
>> >> 1. p4c-tc implementation looks like it should be slower than a
>> >>    in terms of pkts/sec than a bpf implementation. Meaning
>> >>    I suspect pipeline and objects laid out like this will lose
>> >>    to a BPF program with an parser and single lookup. The p4c-ebpf
>> >>    compiler should look to create optimized EBPF code not some
>> >>    emulated switch topology.
>> >>
>> >
>> >The parser is ebpf based. The other objects which require control
>> >plane interaction are not - those interact via netlink.
>> >We published perf data a while back - presented at the P4 workshop
>> >back in April (was in the cover letter)
>> >https://github.com/p4tc-dev/docs/blob/main/p4-conference-2023/2023P4WorkshopP4TC.pdf
>> >But do note: the correct abstraction is the first priority.
>> >Optimization is something we can teach the compiler over time. But
>> >even with the minimalist code generation you can see that our approach
>> >always beats ebpf in LPM and ternary. The other ones I am pretty sure
>>
>> Any idea why? Perhaps the existing eBPF maps are not that suitable for
>> this kinds of lookups? I mean in theory, eBPF should be always faster.
>
>We didnt look closely; however, that is not the point - the point is
>the perf difference if there is one, is not big with the big win being
>proper P4 abstraction. For LPM for sure our algorithmic approach is
>better. For ternary the compute intensity in looping is better done in
>C. And for exact i believe that ebpf uses better hashing.
>Again, that is not the point we were trying to validate in those experiments..
>
>On your point of "maps are not that suitable" P4 tables tend to have
>very specific attributes (examples associated meters, counters,
>default hit and miss actions, etc).
>
>> >we can optimize over time.
>> >Your view of "single lookup" is true for simple programs but if you
>> >have 10 tables trying to model a 5G function then it doesnt make sense
>> >(and i think the data we published was clear that you gain no
>> >advantage using ebpf - as a matter of fact there was no perf
>> >difference between XDP and tc in such cases).
>> >
>> >> 2. p4c-tc control plan looks slower than a directly mmaped bpf
>> >>    map. Doing a simple update vs a netlink msg. The argument
>> >>    that BPF can't do CRUD (which we had offlist) seems incorrect
>> >>    to me. Correct me if I'm wrong with details about why.
>> >>
>> >
>> >So let me see....
>> >you want me to replace netlink and all its features and rewrite it
>> >using the ebpf system calls? Congestion control, event handling,
>> >arbitrary message crafting, etc and the years of work that went into
>> >netlink? NO to the HELL.
>>
>> Wait, I don't think John suggests anything like that. He just suggests
>> to have the tables as eBPF maps.
>
>What's the difference? Unless maps can do netlink.
>
>> Honestly, I don't understand the
>> fixation on netlink. Its socket messaging, memcpies, processing
>> overhead, etc can't keep up with mmaped memory access at scale. Measure
>> that and I bet you'll get drastically different results.
>>
>> I mean, netlink is good for a lot of things, but does not mean it is an
>> universal answer to userspace<->kernel data passing.
>
>Here's a small sample of our requirements that are satisfied by
>netlink for P4 object hierarchy[1]:
>1. Msg construction/parsing
>2. Multi-user request/response messaging

What is actually a usecase for having multiple users program p4 pipeline
in parallel?

>3. Multi-user event subscribe/publish messaging

Same here. What is the usecase for multiple users receiving p4 events?


>
>I dont think i need to provide an explanation on the differences here
>visavis what ebpf system calls provide vs what netlink provides and
>how netlink is a clear fit. If it is not clear i can give more

It is not :/


>breakdown. And of course there's more but above is a good sample.
>
>The part that is taken for granted is the control plane code and
>interaction which is an extremely important detail. P4 Abstraction
>requires hierarchies with different compiler generated encoded path
>ids etc. This ID mapping gets exacerbated by having multitudes of  P4

Why the actual eBFP mapping does not serve the same purpose as ID?
ID:mapping
1 :1
?


>programs which have different requirements. Netlink is a natural fit
>for this P4 abstraction. Not to mention the netlink/tc path (and in
>particular the ID mapping) provides a conduit for offload when that is
>needed.
>eBPF is just a tool - and the objects are intended to be generic - and
>i dont see how any of this could be achieved without retooling to make
>it more specific to P4.
>
>cheers,
>jamal
>
>
>
>>
>> >I should note: that there was an interesting talk at netdevconf 0x17
>> >where the speaker showed the challenges of dealing with ebpf on "day
>> >two" - slides or videos are not up yet, but link is:
>> >https://netdevconf.info/0x17/sessions/talk/is-scaling-ebpf-easy-yet-a-small-step-to-one-server-but-giant-leap-to-distributed-network.html
>> >The point the speaker was making is it's always easy to whip an ebpf
>> >program that can slice and dice packets and maybe even flush LEDs but
>> >the real work and challenge is in the control plane. I agree with the
>> >speaker based on my experiences. This discussion of replacing netlink
>> >with ebpf system calls is absolutely a non-starter. Let's just end the
>> >discussion and agree to disagree if you are going to keep insisting on
>> >that.
>>
>>
>> [...]

  reply	other threads:[~2023-11-20 18:10 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-16 14:59 [PATCH net-next v8 00/15] Introducing P4TC Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 01/15] net: sched: act_api: Introduce dynamic actions list Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 02/15] net/sched: act_api: increase action kind string length Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 03/15] net/sched: act_api: Update tc_action_ops to account for dynamic actions Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 04/15] net/sched: act_api: add struct p4tc_action_ops as a parameter to lookup callback Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 05/15] net: sched: act_api: Add support for preallocated dynamic action instances Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 06/15] net: introduce rcu_replace_pointer_rtnl Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 07/15] rtnl: add helper to check if group has listeners Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 08/15] p4tc: add P4 data types Jamal Hadi Salim
2023-11-16 16:03   ` Jiri Pirko
2023-11-17 12:01     ` Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 09/15] p4tc: add template pipeline create, get, update, delete Jamal Hadi Salim
2023-11-16 16:11   ` Jiri Pirko
2023-11-17 12:09     ` Jamal Hadi Salim
2023-11-20  8:18       ` Jiri Pirko
2023-11-20 12:48         ` Jamal Hadi Salim
2023-11-20 13:16           ` Jiri Pirko
2023-11-20 15:30             ` Jamal Hadi Salim
2023-11-20 16:25               ` Jiri Pirko
2023-11-20 18:20       ` David Ahern
2023-11-20 20:12         ` Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 10/15] p4tc: add action template create, update, delete, get, flush and dump Jamal Hadi Salim
2023-11-16 16:28   ` Jiri Pirko
2023-11-17 15:11     ` Jamal Hadi Salim
2023-11-20  8:19       ` Jiri Pirko
2023-11-20 13:45         ` Jamal Hadi Salim
2023-11-20 16:25           ` Jiri Pirko
2023-11-17  6:51   ` John Fastabend
2023-11-16 14:59 ` [PATCH net-next v8 11/15] p4tc: add template table " Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 12/15] p4tc: add runtime table entry create, update, get, delete, " Jamal Hadi Salim
2023-11-16 14:59 ` [PATCH net-next v8 13/15] p4tc: add set of P4TC table kfuncs Jamal Hadi Salim
2023-11-17  7:09   ` John Fastabend
2023-11-19  9:14   ` kernel test robot
2023-11-20 22:28   ` kernel test robot
2023-11-16 14:59 ` [PATCH net-next v8 14/15] p4tc: add P4 classifier Jamal Hadi Salim
2023-11-17  7:17   ` John Fastabend
2023-11-16 14:59 ` [PATCH net-next v8 15/15] p4tc: Add P4 extern interface Jamal Hadi Salim
2023-11-16 16:42   ` Jiri Pirko
2023-11-17 12:14     ` Jamal Hadi Salim
2023-11-20  8:22       ` Jiri Pirko
2023-11-20 14:02         ` Jamal Hadi Salim
2023-11-20 16:27           ` Jiri Pirko
2023-11-20 19:00             ` Jamal Hadi Salim
2023-11-17  6:27 ` [PATCH net-next v8 00/15] Introducing P4TC John Fastabend
2023-11-17 12:49   ` Jamal Hadi Salim
2023-11-17 18:37     ` John Fastabend
2023-11-17 20:46       ` Jamal Hadi Salim
2023-11-20  9:39         ` Jiri Pirko
2023-11-20 14:23           ` Jamal Hadi Salim
2023-11-20 18:10             ` Jiri Pirko [this message]
2023-11-20 19:56               ` Jamal Hadi Salim
2023-11-20 20:41                 ` John Fastabend
2023-11-20 22:13                   ` Jamal Hadi Salim
2023-11-20 21:48                 ` Daniel Borkmann
2023-11-20 22:56                   ` Jamal Hadi Salim
2023-11-21 13:06                     ` Jiri Pirko
2023-11-21 13:47                       ` Jamal Hadi Salim
2023-11-21 14:19                         ` Jiri Pirko
2023-11-21 15:21                           ` Jamal Hadi Salim
2023-11-22  9:25                             ` Jiri Pirko
2023-11-22 15:14                               ` Jamal Hadi Salim
2023-11-22 18:31                                 ` Jiri Pirko
2023-11-22 18:50                                   ` John Fastabend
2023-11-22 19:35                                   ` Jamal Hadi Salim
2023-11-23  6:36                                     ` Jiri Pirko
2023-11-23 13:22                                       ` Jamal Hadi Salim
2023-11-23 13:34                                         ` Jiri Pirko
2023-11-23 13:45                                           ` Jamal Hadi Salim
2023-11-23 14:07                                             ` Jiri Pirko
2023-11-23 14:28                                               ` Jamal Hadi Salim
2023-11-23 15:27                                                 ` Jiri Pirko
2023-11-23 16:30                                                   ` Jamal Hadi Salim
2023-11-23 17:53                                                     ` Edward Cree
2023-11-23 18:09                                                       ` Jiri Pirko
2023-11-23 18:58                                                         ` Jamal Hadi Salim
2023-11-23 18:53                                                       ` Jakub Kicinski
2023-11-23 19:42                                                         ` Tom Herbert
2023-11-24 10:39                                                         ` Jiri Pirko
2023-11-23 18:04                                                     ` Jiri Pirko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZVuhBlYRwi8eGiSF@nanopsycho \
    --to=jiri@resnulli.us \
    --cc=Mahesh.Shirshyad@amd.com \
    --cc=Vipin.Jain@amd.com \
    --cc=anjali.singhai@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=chris.sommers@keysight.com \
    --cc=dan.daly@intel.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=deb.chatterjee@intel.com \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=jhs@mojatatu.com \
    --cc=john.andy.fingerhut@intel.com \
    --cc=john.fastabend@gmail.com \
    --cc=khalidm@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=mattyk@nvidia.com \
    --cc=mleitner@redhat.com \
    --cc=namrata.limaye@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=toke@redhat.com \
    --cc=tom@sipanda.io \
    --cc=tomasz.osinski@intel.com \
    --cc=vladbu@nvidia.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox