From: Martin KaFai Lau <martin.lau@linux.dev>
To: Kui-Feng Lee <sinquersw@gmail.com>
Cc: netdev@vger.kernel.org, razor@blackwall.org, ast@kernel.org,
andrii@kernel.org, john.fastabend@gmail.com, sdf@google.com,
toke@kernel.org, kuba@kernel.org, andrew@lunn.ch,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
bpf@vger.kernel.org, "Daniel Borkmann" <daniel@iogearbox.net>
Subject: Re: [PATCH bpf-next v4 1/7] netkit, bpf: Add bpf programmable net device
Date: Thu, 26 Oct 2023 11:46:21 -0700
Message-ID: <a14a83e9-e159-3ee0-782b-c4caf7c25428@linux.dev>
In-Reply-To: <d61d1de0-b8d9-42c2-bc6d-bcdd9bef2abf@gmail.com>
On 10/26/23 10:47 AM, Kui-Feng Lee wrote:
>
>
> On 10/25/23 23:20, Daniel Borkmann wrote:
>> Hi Kui-Feng,
>>
>> On 10/26/23 3:18 AM, Kui-Feng Lee wrote:
>>> On 10/25/23 18:15, Kui-Feng Lee wrote:
>>>> On 10/25/23 15:09, Martin KaFai Lau wrote:
>>>>> On 10/25/23 2:24 PM, Kui-Feng Lee wrote:
>>>>>> On 10/24/23 14:48, Daniel Borkmann wrote:
>>>>>>> This work adds a new, minimal BPF-programmable device called "netkit"
>>>>>>> (former PoC code-name "meta") we recently presented at LSF/MM/BPF. The
>>>>>>> core idea is that BPF programs are executed within the drivers xmit routine
>>>>>>> and therefore e.g. in case of containers/Pods moving BPF processing closer
>>>>>>> to the source.
>>>>>>
>>>>>> Sorry for intruding into this discussion! It may be too late to
>>>>>> mention this, since this patchset is already at v4.
>>>>>>
>>>>>> I noticed netkit has introduced a new attach type. I wonder if it is
>>>>>> possible to implement it as a new struct_ops type.
>>>>>
>>>>> Could you elaborate more on what this struct_ops type does and how
>>>>> it is different from the SCHED_CLS bpf prog that netkit is running?
>>>>
>>>> I found that the code has already landed.
>>>> Based on the landed code and
>>>> the patchset for registering bpf struct_ops from modules that I
>>>> am working on, it would look like what is done in the following patch.
>>>> No changes to the syscall, uapi or libbpf are required.
>>>>
>>>> diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c
>>>> index 7e484f9fd3ae..e4eafaf397bf 100644
>>>> --- a/drivers/net/netkit.c
>>>> +++ b/drivers/net/netkit.c
>>>> @@ -20,6 +20,7 @@ struct netkit {
>>>>      struct bpf_mprog_entry __rcu *active;
>>>>      enum netkit_action policy;
>>>>      struct bpf_mprog_bundle bundle;
>>>> +    struct hlist_head ops_list;
>>>>
>>>>      /* Needed in slow-path */
>>>>      enum netkit_mode mode;
>>>> @@ -27,6 +28,13 @@ struct netkit {
>>>>      u32 headroom;
>>>>  };
>>>>
>>>> +struct netkit_ops {
>>>> +    struct hlist_node node;
>>>> +    int ifindex;
>>>> +
>>>> +    int (*xmit)(struct sk_buff *skb);
>>>> +};
>>>> +
>>>>  struct netkit_link {
>>>>      struct bpf_link link;
>>>>      struct net_device *dev;
>>>> @@ -46,6 +54,22 @@ netkit_run(const struct bpf_mprog_entry *entry, struct sk_buff *skb,
>>>>          if (ret != NETKIT_NEXT)
>>>>              break;
>>>>      }
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static __always_inline int
>>>> +netkit_run_st_ops(const struct netkit *nk, struct sk_buff *skb,
>>>> +                  enum netkit_action ret)
>>>> +{
>>>> +    struct netkit_ops *ops;
>>>> +
>>>> +    hlist_for_each_entry_rcu(ops, &nk->ops_list, node) {
>>>> +        ret = ops->xmit(skb);
>>>> +        if (ret != NETKIT_NEXT)
>>>> +            break;
>>>> +    }
>>>> +
>>>>      return ret;
>>>>  }
>>>>
>>>> @@ -80,6 +104,8 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
>>>>      entry = rcu_dereference(nk->active);
>>>>      if (entry)
>>>>          ret = netkit_run(entry, skb, ret);
>>>> +    if (ret == NETKIT_NEXT)
>>>> +        ret = netkit_run_st_ops(nk, skb, ret);
>>>>      switch (ret) {
>>>>      case NETKIT_NEXT:
>>>>      case NETKIT_PASS:
>>
>> I don't think it makes sense to cram struct ops in here for what has been
>> solved already with the bpf_mprog interface in a more efficient way and with
>> control dependencies for the insertion (before/after relative programs/links).
>> The latter is in particular crucial for a multi-user interface when dealing
>> with network traffic (think for example: policy, forwarder, observability
>> prog, etc).
>>
>
> I didn't mean to cram two implementations together,
> and I didn't notice at the beginning that this patchset had already landed.

There are a few ways to track this. The patchwork bot sends a landing message to
the list. There is a lag of a few minutes, but I don't think that lag matters here.
You may want to check your inbox and make sure that message gets through.
git is always the source of truth as well.

> This patch is just to illustrate what it would look like if it were implemented
> with just struct_ops (without bpf_mprog).

Thanks for sharing the struct_ops code snippet. It is an interesting idea to embed
ifindex and other details in the struct.

Setting that aside, it would still need verifier changes to make the
PTR_TO_BTF_ID skb in struct_ops work like the tc __sk_buff, so that all existing
tc-bpf progs work as is. Daniel has already mentioned the ordering API
(bpf_mprog), which has been discussed for a year and is already used in tc-link;
I hope it will be extended to solve the xdp ordering as well. I am also not
convinced that saving two attach types (note that the prog type is the same here)
justifies re-creating something in parallel to tc-link and then requiring the
same "skb" bpf dataplane program to be administered (attach, introspect, etc.)
differently.
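
To make the ordering point concrete, here is a minimal userspace sketch. It is
not kernel code and it is not the bpf_mprog API: the names chain,
chain_insert_before, chain_run, observe and policy_drop are all made up for
illustration, and the verdict values are only meant to resemble the NETKIT_*
actions. It shows a run loop of the same shape as the one in the quoted patch
(stop at the first verdict that is not NEXT) and why being able to insert a
program relative to another one matters when several users share a device.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative verdicts; the names and values are assumptions of this sketch. */
enum verdict { V_NEXT = -1, V_PASS = 0, V_DROP = 2 };

/* Stand-in for an skb; only carries a counter so programs can leave a trace. */
struct pkt { int mark; };

typedef enum verdict (*prog_fn)(struct pkt *p);

#define MAX_PROGS 8

/* Ordered list of programs attached to one (hypothetical) device. */
struct chain {
    prog_fn progs[MAX_PROGS];
    size_t n;
};

/*
 * Insert fn directly before anchor. A NULL anchor appends at the end.
 * Returns 0 on success, -1 if the chain is full or anchor is not attached.
 */
static int chain_insert_before(struct chain *c, prog_fn fn, prog_fn anchor)
{
    size_t pos, i;

    assert(c != NULL && fn != NULL);
    if (c->n >= MAX_PROGS)
        return -1;

    pos = c->n;
    if (anchor) {
        for (pos = 0; pos < c->n; pos++)
            if (c->progs[pos] == anchor)
                break;
        if (pos == c->n)
            return -1;
    }

    /* Shift the tail up by one slot to make room at pos. */
    for (i = c->n; i > pos; i--)
        c->progs[i] = c->progs[i - 1];
    c->progs[pos] = fn;
    c->n++;
    return 0;
}

/* Run programs in order; the first verdict other than V_NEXT ends the walk. */
static enum verdict chain_run(const struct chain *c, struct pkt *p,
                              enum verdict ret)
{
    size_t i;

    for (i = 0; i < c->n; i++) {
        ret = c->progs[i](p);
        if (ret != V_NEXT)
            break;
    }
    return ret;
}

/* An observability program: counts the packet and lets the walk continue. */
static enum verdict observe(struct pkt *p)
{
    p->mark++;
    return V_NEXT;
}

/* A policy program: drops everything. */
static enum verdict policy_drop(struct pkt *p)
{
    (void)p;
    return V_DROP;
}
```

If observe is simply appended after policy_drop, it never sees the packet,
because the walk stops at the drop verdict. Inserting it before policy_drop
lets it count the packet first. A plain list that is walked in registration
order gives the second user no way to ask for that placement.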
Thread overview: 27+ messages
2023-10-24 21:48 [PATCH bpf-next v4 0/7] Add bpf programmable net device Daniel Borkmann
2023-10-24 21:48 ` [PATCH bpf-next v4 1/7] netkit, bpf: " Daniel Borkmann
2023-10-25 15:47 ` Jiri Pirko
2023-10-25 17:20 ` Daniel Borkmann
2023-10-26 5:18 ` Jiri Pirko
2023-10-26 12:11 ` Daniel Borkmann
2023-10-25 19:21 ` Nikolay Aleksandrov
2023-10-26 5:26 ` Jiri Pirko
2023-10-26 6:21 ` Nikolay Aleksandrov
2023-10-25 21:24 ` Kui-Feng Lee
2023-10-25 22:09 ` Martin KaFai Lau
2023-10-26 1:15 ` Kui-Feng Lee
2023-10-26 1:18 ` Kui-Feng Lee
2023-10-26 6:20 ` Daniel Borkmann
2023-10-26 17:47 ` Kui-Feng Lee
2023-10-26 18:46 ` Martin KaFai Lau [this message]
2023-10-24 21:48 ` [PATCH bpf-next v4 2/7] tools: Sync if_link uapi header Daniel Borkmann
2023-10-24 21:49 ` [PATCH bpf-next v4 3/7] libbpf: Add link-based API for netkit Daniel Borkmann
2023-10-24 21:49 ` [PATCH bpf-next v4 4/7] bpftool: Implement link show support " Daniel Borkmann
2023-10-24 21:49 ` [PATCH bpf-next v4 5/7] bpftool: Extend net dump with netkit progs Daniel Borkmann
2023-10-24 21:49 ` [PATCH bpf-next v4 6/7] selftests/bpf: Add netlink helper library Daniel Borkmann
2023-10-24 21:49 ` [PATCH bpf-next v4 7/7] selftests/bpf: Add selftests for netkit Daniel Borkmann
2023-10-24 22:45 ` [PATCH bpf-next v4 0/7] Add bpf programmable net device Martin KaFai Lau
2023-10-24 23:50 ` patchwork-bot+netdevbpf
2023-10-25 15:50 ` Jiri Pirko
2023-10-25 16:54 ` Martin KaFai Lau
2023-10-26 5:35 ` Jiri Pirko