Re: [PATCH bpf-next v4 1/7] netkit, bpf: Add bpf programmable net device

From: Martin KaFai Lau <martin.lau@linux.dev>
To: Kui-Feng Lee <sinquersw@gmail.com>
Cc: netdev@vger.kernel.org, razor@blackwall.org, ast@kernel.org,
	andrii@kernel.org, john.fastabend@gmail.com, sdf@google.com,
	toke@kernel.org, kuba@kernel.org, andrew@lunn.ch,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	bpf@vger.kernel.org, "Daniel Borkmann" <daniel@iogearbox.net>
Subject: Re: [PATCH bpf-next v4 1/7] netkit, bpf: Add bpf programmable net device
Date: Thu, 26 Oct 2023 11:46:21 -0700
Message-ID: <a14a83e9-e159-3ee0-782b-c4caf7c25428@linux.dev>
In-Reply-To: <d61d1de0-b8d9-42c2-bc6d-bcdd9bef2abf@gmail.com>

On 10/26/23 10:47 AM, Kui-Feng Lee wrote:
> 
> 
> On 10/25/23 23:20, Daniel Borkmann wrote:
>> Hi Kui-Feng,
>>
>> On 10/26/23 3:18 AM, Kui-Feng Lee wrote:
>>> On 10/25/23 18:15, Kui-Feng Lee wrote:
>>>> On 10/25/23 15:09, Martin KaFai Lau wrote:
>>>>> On 10/25/23 2:24 PM, Kui-Feng Lee wrote:
>>>>>> On 10/24/23 14:48, Daniel Borkmann wrote:
>>>>>>> This work adds a new, minimal BPF-programmable device called "netkit"
>>>>>>> (former PoC code-name "meta") we recently presented at LSF/MM/BPF. The
>>>>>>> core idea is that BPF programs are executed within the driver's xmit routine
>>>>>>> and therefore e.g. in case of containers/Pods moving BPF processing closer
>>>>>>> to the source.
>>>>>>
>>>>>> Sorry for intruding into this discussion! It may be too late to
>>>>>> mention this, since this patchset is already at v4.
>>>>>>
>>>>>> I notice netkit has introduced a new attach type. I wonder if it is
>>>>>> possible to implement it as a new struct_ops type.
>>>>>
>>>>> Could you elaborate more on what this struct_ops type does and how
>>>>> it is different from the SCHED_CLS bpf prog that netkit is running?
>>>>
>>>> I found the code has landed.
>>>> Based on the landed code and
>>>> the patchset for registering bpf struct_ops from modules that I
>>>> am working on, it would look like the following patch.
>>>> No changes to syscall, uapi or libbpf are required.
>>>>
>>>> diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c
>>>> index 7e484f9fd3ae..e4eafaf397bf 100644
>>>> --- a/drivers/net/netkit.c
>>>> +++ b/drivers/net/netkit.c
>>>> @@ -20,6 +20,7 @@ struct netkit {
>>>>       struct bpf_mprog_entry __rcu *active;
>>>>       enum netkit_action policy;
>>>>       struct bpf_mprog_bundle    bundle;
>>>> +    struct hlist_head ops_list;
>>>>
>>>>       /* Needed in slow-path */
>>>>       enum netkit_mode mode;
>>>> @@ -27,6 +28,13 @@ struct netkit {
>>>>       u32 headroom;
>>>>   };
>>>>
>>>> +struct netkit_ops {
>>>> +    struct hlist_node node;
>>>> +    int ifindex;
>>>> +
>>>> +    int (*xmit)(struct sk_buff *skb);
>>>> +};
>>>> +
>>>>   struct netkit_link {
>>>>       struct bpf_link link;
>>>>       struct net_device *dev;
>>>> @@ -46,6 +54,22 @@ netkit_run(const struct bpf_mprog_entry *entry, struct sk_buff *skb,
>>>>           if (ret != NETKIT_NEXT)
>>>>               break;
>>>>       }
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static __always_inline int
>>>> +netkit_run_st_ops(const struct netkit *nk, struct sk_buff *skb,
>>>> +       enum netkit_action ret)
>>>> +{
>>>> +    struct netkit_ops *ops;
>>>> +
>>>> +    hlist_for_each_entry_rcu(ops, &nk->ops_list, node) {
>>>> +        ret = ops->xmit(skb);
>>>> +        if (ret != NETKIT_NEXT)
>>>> +            break;
>>>> +    }
>>>> +
>>>>       return ret;
>>>>   }
>>>>
>>>> @@ -80,6 +104,8 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
>>>>       entry = rcu_dereference(nk->active);
>>>>       if (entry)
>>>>           ret = netkit_run(entry, skb, ret);
>>>> +    if (ret == NETKIT_NEXT)
>>>> +        ret = netkit_run_st_ops(nk, skb, ret);
>>>>       switch (ret) {
>>>>       case NETKIT_NEXT:
>>>>       case NETKIT_PASS:
>>
>> I don't think it makes sense to cram struct ops in here for what has been
>> solved already with the bpf_mprog interface in a more efficient way and with
>> control dependencies for the insertion (before/after relative programs/links).
>> The latter is in particular crucial for a multi-user interface when dealing
>> with network traffic (think for example: policy, forwarder, observability
>> prog, etc).
>>
> 
> I didn't mean to cram the two implementations together,
> and I didn't notice at first that this patchset had already landed.

There are a few ways to track this. The patchwork bot sends a message to the
list when a series lands. There is a lag of a few minutes, but I don't think
that lag matters here. You may want to check your inbox and make sure those
messages are getting through.

git is always the source of truth as well.

> This patch is just to explain what it would look like if it were implemented
> with just struct_ops (without bpf_mprog).

Thanks for sharing a struct_ops code snippet. It is an interesting idea to embed 
ifindex and other details in the struct.

That aside, it would still need verifier changes to make the PTR_TO_BTF_ID skb
in struct_ops work like the tc __sk_buff, so that all existing tc-bpf progs
work as is. Daniel has already mentioned the ordering API (bpf_mprog), which
has been discussed for a year and is already used in tc-link; I hope it will be
extended to solve xdp ordering as well. I am also not convinced that saving two
attach types (note that the prog type is the same here) justifies re-creating
something in parallel to tc-link and then requiring the same "skb" bpf
dataplane program to be administered (attach, introspect, etc.) differently.
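
To make the ordering point concrete, here is a minimal userspace sketch of the
"stop at the first verdict that is not NEXT" loop used by both netkit_run() and
the proposed netkit_run_st_ops() in the quoted diff. It is only a model, not
kernel code: struct pkt, struct chain, chain_insert(), chain_run(),
policy_prog() and observe_prog() are hypothetical names, and the verdict values
are illustrative rather than taken from the uapi header. It shows why the
insertion position of a program relative to the others changes what each
program gets to see.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative verdicts; the real NETKIT_* values live in the uapi header. */
enum verdict { V_NEXT = -1, V_PASS = 0, V_DROP = 2 };

/* Hypothetical stand-in for an skb: a port plus an observation counter. */
struct pkt {
	int port;
	int seen;
};

typedef enum verdict (*prog_fn)(struct pkt *p);

#define MAX_PROGS 8

/* Ordered list of programs; index 0 runs first. */
struct chain {
	prog_fn progs[MAX_PROGS];
	size_t n;
};

/* Insert fn at position idx, shifting later programs back by one slot. */
static int chain_insert(struct chain *c, size_t idx, prog_fn fn)
{
	assert(c && fn);
	if (c->n >= MAX_PROGS || idx > c->n)
		return -1;
	for (size_t i = c->n; i > idx; i--)
		c->progs[i] = c->progs[i - 1];
	c->progs[idx] = fn;
	c->n++;
	return 0;
}

/* Run programs in order; the first verdict other than V_NEXT ends the walk. */
static enum verdict chain_run(const struct chain *c, struct pkt *p,
			      enum verdict ret)
{
	for (size_t i = 0; i < c->n; i++) {
		ret = c->progs[i](p);
		if (ret != V_NEXT)
			break;
	}
	return ret;
}

/* Policy: drop packets to port 23, let everything else continue. */
static enum verdict policy_prog(struct pkt *p)
{
	return p->port == 23 ? V_DROP : V_NEXT;
}

/* Observability: count the packet and always continue. */
static enum verdict observe_prog(struct pkt *p)
{
	p->seen++;
	return V_NEXT;
}
```

With policy_prog at index 0 and observe_prog at index 1, a packet to port 23 is
dropped before it is counted. Inserting observe_prog at index 0 instead means
the same packet is counted and then dropped. A plain list walk like the one in
the snippet gives users no way to express that choice, which is the part that
bpf_mprog's before/after control covers.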