From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Martin KaFai Lau <martin.lau@linux.dev>
Cc: deb.chatterjee@intel.com, anjali.singhai@intel.com,
namrata.limaye@intel.com, mleitner@redhat.com,
Mahesh.Shirshyad@amd.com, tomasz.osinski@intel.com,
jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
vladbu@nvidia.com, horms@kernel.org, khalidm@nvidia.com,
daniel@iogearbox.net, bpf@vger.kernel.org,
netdev@vger.kernel.org, Jamal Hadi Salim <jhs@mojatatu.com>
Subject: Re: [PATCH net-next v9 14/15] p4tc: add set of P4TC table kfuncs
Date: Mon, 11 Dec 2023 16:00:03 +0100 [thread overview]
Message-ID: <87plzc239o.fsf@toke.dk> (raw)
In-Reply-To: <335fbd65-585d-47b8-a98f-c0898aff7d7f@linux.dev>
Martin KaFai Lau <martin.lau@linux.dev> writes:
> On 12/8/23 2:15 AM, Toke Høiland-Jørgensen wrote:
>> Martin KaFai Lau <martin.lau@linux.dev> writes:
>>
>>> On 12/1/23 10:29 AM, Jamal Hadi Salim wrote:
>>>> We add an initial set of kfuncs to allow interactions from eBPF programs
>>>> to the P4TC domain.
>>>>
>>>> - bpf_p4tc_tbl_read: Used to lookup a table entry from a BPF
>>>> program installed in TC. To find the table entry we take in an skb, the
>>>> pipeline ID, the table ID, a key and a key size.
>>>> We use the skb to get the network namespace structure where all the
>>>> pipelines are stored. After that we use the pipeline ID and the table
>>>> ID, to find the table. We then use the key to search for the entry.
>>>> We return an entry on success and NULL on failure.
>>>>
>>>> - xdp_p4tc_tbl_read: Used to lookup a table entry from a BPF
>>>> program installed in XDP. To find the table entry we take in an xdp_md,
>>>> the pipeline ID, the table ID, a key and a key size.
>>>> We use struct xdp_md to get the network namespace structure where all
>>>> the pipelines are stored. After that we use the pipeline ID and the table
>>>> ID, to find the table. We then use the key to search for the entry.
>>>> We return an entry on success and NULL on failure.
>>>>
>>>> - bpf_p4tc_entry_create: Used to create a table entry from a BPF
>>>> program installed in TC. To create the table entry we take an skb, the
>>>> pipeline ID, the table ID, a key and its size, and an action which will
>>>> be associated with the new entry.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - xdp_p4tc_entry_create: Used to create a table entry from a BPF
>>>> program installed in XDP. To create the table entry we take an xdp_md, the
>>>> pipeline ID, the table ID, a key and its size, and an action which will
>>>> be associated with the new entry.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - bpf_p4tc_entry_create_on_miss: conforms to PNA "add on miss".
>>>> First does a lookup using the passed key and upon a miss will add the entry
>>>> to the table.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - xdp_p4tc_entry_create_on_miss: conforms to PNA "add on miss".
>>>> First does a lookup using the passed key and upon a miss will add the entry
>>>> to the table.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - bpf_p4tc_entry_update: Used to update a table entry from a BPF
>>>> program installed in TC. To update the table entry we take an skb, the
>>>> pipeline ID, the table ID, a key and its size, and an action which will
>>>> be associated with the new entry.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - xdp_p4tc_entry_update: Used to update a table entry from a BPF
>>>> program installed in XDP. To update the table entry we take an xdp_md, the
>>>> pipeline ID, the table ID, a key and its size, and an action which will
>>>> be associated with the new entry.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - bpf_p4tc_entry_delete: Used to delete a table entry from a BPF
>>>> program installed in TC. To delete the table entry we take an skb, the
>>>> pipeline ID, the table ID, a key and a key size.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - xdp_p4tc_entry_delete: Used to delete a table entry from a BPF
>>>> program installed in XDP. To delete the table entry we take an xdp_md, the
>>>> pipeline ID, the table ID, a key and a key size.
>>>> We return 0 on success and a negative errno on failure
>>>
>>> [ ... ]
>>>
>>>> +BTF_SET8_START(p4tc_kfunc_check_tbl_set_skb)
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_tbl_read, KF_RET_NULL);
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_create);
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_create_on_miss);
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_update);
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_delete);
>>>> +BTF_SET8_END(p4tc_kfunc_check_tbl_set_skb)
>>>
>>> These create/read/update/delete kfuncs are like defining a new hidden bpf map
>>> type in the kernel. bpf prog can now create its own link-list and rbtree.
>>> sched_ext has already been using it. This is the way the bpf prog should use
>>> instead of creating a new map type.
>>
>> I don't really think this is an accurate assessment, given Jamal's use
>> case. These kfuncs are more akin to the FIB lookup helper, or the
>> netfilter kfuncs: they provide lookup into a kernel-internal data
>> structure, so that BPF can access that data structure while staying in
>> sync with the rest of the kernel.
>>
>> If this was a BPF-only implementation you'd be right, but given the
>> constraint of having the P4 objects represented in the kernel[0], I
>> think this is a perfectly reasonable use of kfuncs, even though they
>> happen to look like the map API.
>>
>> -Toke
>>
>> [0] Whether having those objects represented at all is reasonable is a
>> separate discussion, which I believe John et al are having with Jamal in
>> a separate subthread. I don't personally have any strong objections to
>> doing that.
>
> I might not be clear. It was my question on why it has to be in the kernel
> instead of in the bpf map, so the earlier bpf link-list and rbtree example just
> in case this recent bpf capability has not been considered.
A bit tangential, but it came to mind while thinking about this: how
would one go about updating a bpf rbtree-based data structure from
userspace? Is there a way to get bpf_map_update()-semantics that inserts
things into the rbtree somehow?
> If it is an existing kernel infra-structure, kfunc is a reasonable use.
>
> The P4 objects are newly added to this set with bpf program as its user. It can
> be represented in the bpf map as well instead of in the kernel.
>
> or is it fair to say that bpf prog is not the primary consumer of the P4
> objects. Instead kernel is the primary user of the p4 objects such that p4tc can
> work independently without the bpf piece to begin with and bpf could be
> considered as an extension later?
That's a good question, actually. I think that conceptually, if viewed
purely as a control plane, it could be merged separately and the BPF
support added later. But with this series, that would make it a control
plane that doesn't really control anything; so there would need to be a
second consumer (hardware offload?) added for that to make sense, I
suppose.
Or to put it another way, the way this series is designed, there is an
implicit "these are kernel objects that we want to use for other things"
assumption in there; it's just that those "other things" are not part
of this series (because hardware offload doesn't exist yet - I think?
I'll let Jamal answer that). I can see the point of asking for that
second user, though, as that would make it clear why the control plane
needs to be in the kernel.
-Toke
next prev parent reply other threads:[~2023-12-11 15:00 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-01 18:28 [PATCH net-next v9 00/15] Introducing P4TC Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 01/15] net: sched: act_api: Introduce P4 actions list Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 02/15] net/sched: act_api: increase action kind string length Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 03/15] net/sched: act_api: Update tc_action_ops to account for P4 actions Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 04/15] net/sched: act_api: add struct p4tc_action_ops as a parameter to lookup callback Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 05/15] net: sched: act_api: Add support for preallocated P4 action instances Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 06/15] net: introduce rcu_replace_pointer_rtnl Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 07/15] rtnl: add helper to check if group has listeners Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 08/15] p4tc: add P4 data types Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 09/15] p4tc: add template pipeline create, get, update, delete Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 10/15] p4tc: add action template create, update, delete, get, flush and dump Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 11/15] p4tc: add P4 action runtime support Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 12/15] p4tc: add template table create, update, delete, get, flush and dump Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 13/15] p4tc: add runtime table entry create, update, get, delete, " Jamal Hadi Salim
2023-12-06 5:34 ` Dan Carpenter
2023-12-06 15:08 ` Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 14/15] p4tc: add set of P4TC table kfuncs Jamal Hadi Salim
2023-12-08 7:33 ` Martin KaFai Lau
2023-12-08 10:15 ` Toke Høiland-Jørgensen
2023-12-08 20:07 ` Martin KaFai Lau
2023-12-11 15:00 ` Toke Høiland-Jørgensen [this message]
2023-12-11 15:18 ` Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 15/15] p4tc: add P4 classifier Jamal Hadi Salim
2023-12-05 0:32 ` John Fastabend
2023-12-05 13:43 ` Daniel Borkmann
2023-12-05 16:23 ` Jamal Hadi Salim
2023-12-05 22:32 ` Daniel Borkmann
2023-12-06 14:59 ` Jamal Hadi Salim
2023-12-08 10:06 ` Daniel Borkmann
2023-12-11 15:43 ` Jamal Hadi Salim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87plzc239o.fsf@toke.dk \
--to=toke@redhat.com \
--cc=Mahesh.Shirshyad@amd.com \
--cc=anjali.singhai@intel.com \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=deb.chatterjee@intel.com \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=khalidm@nvidia.com \
--cc=kuba@kernel.org \
--cc=martin.lau@linux.dev \
--cc=mleitner@redhat.com \
--cc=namrata.limaye@intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=tomasz.osinski@intel.com \
--cc=vladbu@nvidia.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).