From: Martin KaFai Lau <martin.lau@linux.dev>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: deb.chatterjee@intel.com, anjali.singhai@intel.com,
namrata.limaye@intel.com, mleitner@redhat.com,
Mahesh.Shirshyad@amd.com, tomasz.osinski@intel.com,
jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
vladbu@nvidia.com, horms@kernel.org, khalidm@nvidia.com,
daniel@iogearbox.net, bpf@vger.kernel.org,
netdev@vger.kernel.org, Jamal Hadi Salim <jhs@mojatatu.com>
Subject: Re: [PATCH net-next v9 14/15] p4tc: add set of P4TC table kfuncs
Date: Fri, 8 Dec 2023 12:07:27 -0800 [thread overview]
Message-ID: <335fbd65-585d-47b8-a98f-c0898aff7d7f@linux.dev> (raw)
In-Reply-To: <87lea5j8ys.fsf@toke.dk>
On 12/8/23 2:15 AM, Toke Høiland-Jørgensen wrote:
> Martin KaFai Lau <martin.lau@linux.dev> writes:
>
>> On 12/1/23 10:29 AM, Jamal Hadi Salim wrote:
>>> We add an initial set of kfuncs to allow interactions from eBPF programs
>>> to the P4TC domain.
>>>
>>> - bpf_p4tc_tbl_read: Used to lookup a table entry from a BPF
>>> program installed in TC. To find the table entry we take in an skb, the
>>> pipeline ID, the table ID, a key and a key size.
>>> We use the skb to get the network namespace structure where all the
>>> pipelines are stored. After that we use the pipeline ID and the table
>>> ID, to find the table. We then use the key to search for the entry.
>>> We return an entry on success and NULL on failure.
>>>
>>> - xdp_p4tc_tbl_read: Used to lookup a table entry from a BPF
>>> program installed in XDP. To find the table entry we take in an xdp_md,
>>> the pipeline ID, the table ID, a key and a key size.
>>> We use struct xdp_md to get the network namespace structure where all
>>> the pipelines are stored. After that we use the pipeline ID and the table
>>> ID, to find the table. We then use the key to search for the entry.
>>> We return an entry on success and NULL on failure.
>>>
>>> - bpf_p4tc_entry_create: Used to create a table entry from a BPF
>>> program installed in TC. To create the table entry we take an skb, the
>>> pipeline ID, the table ID, a key and its size, and an action which will
>>> be associated with the new entry.
>>> We return 0 on success and a negative errno on failure
>>>
>>> - xdp_p4tc_entry_create: Used to create a table entry from a BPF
>>> program installed in XDP. To create the table entry we take an xdp_md, the
>>> pipeline ID, the table ID, a key and its size, and an action which will
>>> be associated with the new entry.
>>> We return 0 on success and a negative errno on failure
>>>
>>> - bpf_p4tc_entry_create_on_miss: conforms to PNA "add on miss".
>>> First does a lookup using the passed key and upon a miss will add the entry
>>> to the table.
>>> We return 0 on success and a negative errno on failure
>>>
>>> - xdp_p4tc_entry_create_on_miss: conforms to PNA "add on miss".
>>> First does a lookup using the passed key and upon a miss will add the entry
>>> to the table.
>>> We return 0 on success and a negative errno on failure
>>>
>>> - bpf_p4tc_entry_update: Used to update a table entry from a BPF
>>> program installed in TC. To update the table entry we take an skb, the
>>> pipeline ID, the table ID, a key and its size, and an action which will
>>> be associated with the new entry.
>>> We return 0 on success and a negative errno on failure
>>>
>>> - xdp_p4tc_entry_update: Used to update a table entry from a BPF
>>> program installed in XDP. To update the table entry we take an xdp_md, the
>>> pipeline ID, the table ID, a key and its size, and an action which will
>>> be associated with the new entry.
>>> We return 0 on success and a negative errno on failure
>>>
>>> - bpf_p4tc_entry_delete: Used to delete a table entry from a BPF
>>> program installed in TC. To delete the table entry we take an skb, the
>>> pipeline ID, the table ID, a key and a key size.
>>> We return 0 on success and a negative errno on failure
>>>
>>> - xdp_p4tc_entry_delete: Used to delete a table entry from a BPF
>>> program installed in XDP. To delete the table entry we take an xdp_md, the
>>> pipeline ID, the table ID, a key and a key size.
>>> We return 0 on success and a negative errno on failure
>>
>> [ ... ]
>>
>>> +BTF_SET8_START(p4tc_kfunc_check_tbl_set_skb)
>>> +BTF_ID_FLAGS(func, bpf_p4tc_tbl_read, KF_RET_NULL);
>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_create);
>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_create_on_miss);
>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_update);
>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_delete);
>>> +BTF_SET8_END(p4tc_kfunc_check_tbl_set_skb)
>>
>> These create/read/update/delete kfuncs are like defining a new hidden bpf map
>> type in the kernel. bpf prog can now create its own link-list and rbtree.
>> sched_ext has already been using it. This is the way the bpf prog should use
>> instead of creating a new map type.
>
> I don't really think this is an accurate assessment, given Jamal's use
> case. These kfuncs are more akin to the FIB lookup helper, or the
> netfilter kfuncs: they provide lookup into a kernel-internal data
> structure, so that BPF can access that data structure while staying in
> sync with the rest of the kernel.
>
> If this was a BPF-only implementation you'd be right, but given the
> constraint of having the P4 objects represented in the kernel[0], I
> think this is a perfectly reasonable use of kfuncs, even though they
> happen to look like the map API.
>
> -Toke
>
> [0] Whether having those objects represented at all is reasonable is a
> separate discussion, which I believe John et al are having with Jamal in
> a separate subthread. I don't personally have any strong objections to
> doing that.
I might not be clear. It was my question on why it has to be in the kernel
instead of in the bpf map, so the earlier bpf link-list and rbtree example just
in case this recent bpf capability has not been considered.
If it is an existing kernel infra-structure, kfunc is a reasonable use.
The P4 objects are newly added to this set with bpf program as its user. It can
be represented in the bpf map as well instead of in the kernel.
or is it fair to say that bpf prog is not the primary consumer of the P4
objects. Instead kernel is the primary user of the p4 objects such that p4tc can
work independently without the bpf piece to begin with and bpf could be
considered as an extension later?
next prev parent reply other threads:[~2023-12-08 20:07 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-01 18:28 [PATCH net-next v9 00/15] Introducing P4TC Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 01/15] net: sched: act_api: Introduce P4 actions list Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 02/15] net/sched: act_api: increase action kind string length Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 03/15] net/sched: act_api: Update tc_action_ops to account for P4 actions Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 04/15] net/sched: act_api: add struct p4tc_action_ops as a parameter to lookup callback Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 05/15] net: sched: act_api: Add support for preallocated P4 action instances Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 06/15] net: introduce rcu_replace_pointer_rtnl Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 07/15] rtnl: add helper to check if group has listeners Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 08/15] p4tc: add P4 data types Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 09/15] p4tc: add template pipeline create, get, update, delete Jamal Hadi Salim
2023-12-01 18:28 ` [PATCH net-next v9 10/15] p4tc: add action template create, update, delete, get, flush and dump Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 11/15] p4tc: add P4 action runtime support Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 12/15] p4tc: add template table create, update, delete, get, flush and dump Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 13/15] p4tc: add runtime table entry create, update, get, delete, " Jamal Hadi Salim
2023-12-06 5:34 ` Dan Carpenter
2023-12-06 15:08 ` Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 14/15] p4tc: add set of P4TC table kfuncs Jamal Hadi Salim
2023-12-08 7:33 ` Martin KaFai Lau
2023-12-08 10:15 ` Toke Høiland-Jørgensen
2023-12-08 20:07 ` Martin KaFai Lau [this message]
2023-12-11 15:00 ` Toke Høiland-Jørgensen
2023-12-11 15:18 ` Jamal Hadi Salim
2023-12-01 18:29 ` [PATCH net-next v9 15/15] p4tc: add P4 classifier Jamal Hadi Salim
2023-12-05 0:32 ` John Fastabend
2023-12-05 13:43 ` Daniel Borkmann
2023-12-05 16:23 ` Jamal Hadi Salim
2023-12-05 22:32 ` Daniel Borkmann
2023-12-06 14:59 ` Jamal Hadi Salim
2023-12-08 10:06 ` Daniel Borkmann
2023-12-11 15:43 ` Jamal Hadi Salim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=335fbd65-585d-47b8-a98f-c0898aff7d7f@linux.dev \
--to=martin.lau@linux.dev \
--cc=Mahesh.Shirshyad@amd.com \
--cc=anjali.singhai@intel.com \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=deb.chatterjee@intel.com \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=khalidm@nvidia.com \
--cc=kuba@kernel.org \
--cc=mleitner@redhat.com \
--cc=namrata.limaye@intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=toke@redhat.com \
--cc=tomasz.osinski@intel.com \
--cc=vladbu@nvidia.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).