netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jamal Hadi Salim <jhs@mojatatu.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>,
	deb.chatterjee@intel.com,  anjali.singhai@intel.com,
	namrata.limaye@intel.com, mleitner@redhat.com,
	 Mahesh.Shirshyad@amd.com, tomasz.osinski@intel.com,
	jiri@resnulli.us,  xiyou.wangcong@gmail.com, davem@davemloft.net,
	edumazet@google.com,  kuba@kernel.org, pabeni@redhat.com,
	vladbu@nvidia.com, horms@kernel.org,  khalidm@nvidia.com,
	daniel@iogearbox.net, bpf@vger.kernel.org,
	 netdev@vger.kernel.org
Subject: Re: [PATCH net-next v9 14/15] p4tc: add set of P4TC table kfuncs
Date: Mon, 11 Dec 2023 10:18:24 -0500	[thread overview]
Message-ID: <CAM0EoMnH4kgogD4raKjNeAL6XiNEcVU201_1-W5nNYiX-JMwsQ@mail.gmail.com> (raw)
In-Reply-To: <87plzc239o.fsf@toke.dk>

On Mon, Dec 11, 2023 at 10:00 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Martin KaFai Lau <martin.lau@linux.dev> writes:
>
> > On 12/8/23 2:15 AM, Toke Høiland-Jørgensen wrote:
> >> Martin KaFai Lau <martin.lau@linux.dev> writes:
> >>
> >>> On 12/1/23 10:29 AM, Jamal Hadi Salim wrote:
> >>>> We add an initial set of kfuncs to allow interactions from eBPF programs
> >>>> to the P4TC domain.
> >>>>
> >>>> - bpf_p4tc_tbl_read: Used to lookup a table entry from a BPF
> >>>> program installed in TC. To find the table entry we take in an skb, the
> >>>> pipeline ID, the table ID, a key and a key size.
> >>>> We use the skb to get the network namespace structure where all the
> >>>> pipelines are stored. After that we use the pipeline ID and the table
> >>>> ID, to find the table. We then use the key to search for the entry.
> >>>> We return an entry on success and NULL on failure.
> >>>>
> >>>> - xdp_p4tc_tbl_read: Used to lookup a table entry from a BPF
> >>>> program installed in XDP. To find the table entry we take in an xdp_md,
> >>>> the pipeline ID, the table ID, a key and a key size.
> >>>> We use struct xdp_md to get the network namespace structure where all
> >>>> the pipelines are stored. After that we use the pipeline ID and the table
> >>>> ID, to find the table. We then use the key to search for the entry.
> >>>> We return an entry on success and NULL on failure.
> >>>>
> >>>> - bpf_p4tc_entry_create: Used to create a table entry from a BPF
> >>>> program installed in TC. To create the table entry we take an skb, the
> >>>> pipeline ID, the table ID, a key and its size, and an action which will
> >>>> be associated with the new entry.
> >>>> We return 0 on success and a negative errno on failure
> >>>>
> >>>> - xdp_p4tc_entry_create: Used to create a table entry from a BPF
> >>>> program installed in XDP. To create the table entry we take an xdp_md, the
> >>>> pipeline ID, the table ID, a key and its size, and an action which will
> >>>> be associated with the new entry.
> >>>> We return 0 on success and a negative errno on failure
> >>>>
> >>>> - bpf_p4tc_entry_create_on_miss: conforms to PNA "add on miss".
> >>>> First does a lookup using the passed key and upon a miss will add the entry
> >>>> to the table.
> >>>> We return 0 on success and a negative errno on failure
> >>>>
> >>>> - xdp_p4tc_entry_create_on_miss: conforms to PNA "add on miss".
> >>>> First does a lookup using the passed key and upon a miss will add the entry
> >>>> to the table.
> >>>> We return 0 on success and a negative errno on failure
> >>>>
> >>>> - bpf_p4tc_entry_update: Used to update a table entry from a BPF
> >>>> program installed in TC. To update the table entry we take an skb, the
> >>>> pipeline ID, the table ID, a key and its size, and an action which will
> >>>> be associated with the new entry.
> >>>> We return 0 on success and a negative errno on failure
> >>>>
> >>>> - xdp_p4tc_entry_update: Used to update a table entry from a BPF
> >>>> program installed in XDP. To update the table entry we take an xdp_md, the
> >>>> pipeline ID, the table ID, a key and its size, and an action which will
> >>>> be associated with the new entry.
> >>>> We return 0 on success and a negative errno on failure
> >>>>
> >>>> - bpf_p4tc_entry_delete: Used to delete a table entry from a BPF
> >>>> program installed in TC. To delete the table entry we take an skb, the
> >>>> pipeline ID, the table ID, a key and a key size.
> >>>> We return 0 on success and a negative errno on failure
> >>>>
> >>>> - xdp_p4tc_entry_delete: Used to delete a table entry from a BPF
> >>>> program installed in XDP. To delete the table entry we take an xdp_md, the
> >>>> pipeline ID, the table ID, a key and a key size.
> >>>> We return 0 on success and a negative errno on failure
> >>>
> >>> [ ... ]
> >>>
> >>>> +BTF_SET8_START(p4tc_kfunc_check_tbl_set_skb)
> >>>> +BTF_ID_FLAGS(func, bpf_p4tc_tbl_read, KF_RET_NULL);
> >>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_create);
> >>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_create_on_miss);
> >>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_update);
> >>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_delete);
> >>>> +BTF_SET8_END(p4tc_kfunc_check_tbl_set_skb)
> >>>
> >>> These create/read/update/delete kfuncs are like defining a new hidden bpf map
> >>> type in the kernel. bpf prog can now create its own link-list and rbtree.
> >>> sched_ext has already been using it. This is the way the bpf prog should use
> >>> instead of creating a new map type.
> >>
> >> I don't really think this is an accurate assessment, given Jamal's use
> >> case. These kfuncs are more akin to the FIB lookup helper, or the
> >> netfilter kfuncs: they provide lookup into a kernel-internal data
> >> structure, so that BPF can access that data structure while staying in
> >> sync with the rest of the kernel.
> >>
> >> If this was a BPF-only implementation you'd be right, but given the
> >> constraint of having the P4 objects represented in the kernel[0], I
> >> think this is a perfectly reasonable use of kfuncs, even though they
> >> happen to look like the map API.
> >>
> >> -Toke
> >>
> >> [0] Whether having those objects represented at all is reasonable is a
> >> separate discussion, which I believe John et al are having with Jamal in
> >> a separate subthread. I don't personally have any strong objections to
> >> doing that.
> >
> > I might not be clear. It was my question on why it has to be in the kernel
> > instead of in the bpf map, so the earlier bpf link-list and rbtree example just
> > in case this recent bpf capability has not been considered.
>
> A bit tangential, but it came to mind while thinking about this: how
> would one go about updating a bpf rbtree-based data structure from
> userspace? Is there a way to get bpf_map_update()-semantics that inserts
> things into the rbtree somehow?
>
> > If it is an existing kernel infra-structure, kfunc is a reasonable use.
> >
> > The P4 objects are newly added to this set with bpf program as its user. It can
> > be represented in the bpf map as well instead of in the kernel.
> >
> > or is it fair to say that bpf prog is not the primary consumer of the P4
> > objects. Instead kernel is the primary user of the p4 objects such that p4tc can
> > work independently without the bpf piece to begin with and bpf could be
> > considered as an extension later?
>
> That's a good question, actually. I think that conceptually, if viewed
> purely as a control plane, it could be merged separately and the BPF
> support added later. But with this series, that would make it a control
> plane that doesn't really control anything; so there would need to be a
> second consumer (hardware offload?) added for that to make sense, I
> suppose.
>
> Or to put it another way, the way this series is designed, there is an
> implicit "these are kernel objects that we want to use for other things"
> assumption in there; it's just that those "other things" are not part
> of this series (because hardware offload doesn't exist yet - I think?
> I'll let Jamal answer that). I can see the point of asking for that
> second user, though, as that would make it clear why the control plane
> needs to be in the kernel.

Yes, HW is also a consumer via TC but these patches are for the s/w only piece.

Martin, to get some understanding, see the last slide on
https://netdevconf.info/0x17/sessions/talk/integrating-ebpf-into-the-p4tc-datapath.html
If you want to see the history of what changed from V1 for the s/w
side, see slide #14. And if you have the patience, look at the whole
slide set, and read the abstract. If clarity is still lacking then the
cover letter. And if you have even more patience then the relevant
commit messages but more importantly we need help to review if we are
making any amateur mistakes for the kfunc piece. TC (which is used for
such offloads, see u32, flower, etc) has its own model for
config/control driven via netlink. There are more objects than just
tables (eg pipelines, actions, externs etc) that all are _attached to
netns_ where they were instantiated - so we have a few more kfuncs for
those objects which are not part of this patchset (since the rules say
15 patches is the max).

Hope that helps.

cheers,
jamal
> -Toke
>

  reply	other threads:[~2023-12-11 15:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-01 18:28 [PATCH net-next v9 00/15] Introducing P4TC Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 01/15] net: sched: act_api: Introduce P4 actions list Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 02/15] net/sched: act_api: increase action kind string length Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 03/15] net/sched: act_api: Update tc_action_ops to account for P4 actions Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 04/15] net/sched: act_api: add struct p4tc_action_ops as a parameter to lookup callback Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 05/15] net: sched: act_api: Add support for preallocated P4 action instances Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 06/15] net: introduce rcu_replace_pointer_rtnl Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 07/15] rtnl: add helper to check if group has listeners Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 08/15] p4tc: add P4 data types Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 09/15] p4tc: add template pipeline create, get, update, delete Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 10/15] p4tc: add action template create, update, delete, get, flush and dump Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 11/15] p4tc: add P4 action runtime support Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 12/15] p4tc: add template table create, update, delete, get, flush and dump Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 13/15] p4tc: add runtime table entry create, update, get, delete, " Jamal Hadi Salim
2023-12-06  5:34   ` Dan Carpenter
2023-12-06 15:08     ` Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 14/15] p4tc: add set of P4TC table kfuncs Jamal Hadi Salim
2023-12-08  7:33   ` Martin KaFai Lau
2023-12-08 10:15     ` Toke Høiland-Jørgensen
2023-12-08 20:07       ` Martin KaFai Lau
2023-12-11 15:00         ` Toke Høiland-Jørgensen
2023-12-11 15:18           ` Jamal Hadi Salim [this message]
2023-12-01 18:29 ` [PATCH net-next v9 15/15] p4tc: add P4 classifier Jamal Hadi Salim
2023-12-05  0:32   ` John Fastabend
2023-12-05 13:43     ` Daniel Borkmann
2023-12-05 16:23       ` Jamal Hadi Salim
2023-12-05 22:32         ` Daniel Borkmann
2023-12-06 14:59           ` Jamal Hadi Salim
2023-12-08 10:06             ` Daniel Borkmann
2023-12-11 15:43               ` Jamal Hadi Salim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAM0EoMnH4kgogD4raKjNeAL6XiNEcVU201_1-W5nNYiX-JMwsQ@mail.gmail.com \
    --to=jhs@mojatatu.com \
    --cc=Mahesh.Shirshyad@amd.com \
    --cc=anjali.singhai@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=deb.chatterjee@intel.com \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=jiri@resnulli.us \
    --cc=khalidm@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=mleitner@redhat.com \
    --cc=namrata.limaye@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=toke@redhat.com \
    --cc=tomasz.osinski@intel.com \
    --cc=vladbu@nvidia.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).