All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@kernel.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Stanislav Fomichev <sdf@google.com>
Cc: bpf <bpf@vger.kernel.org>, "Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Song Liu" <song@kernel.org>, "Yonghong Song" <yhs@fb.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"KP Singh" <kpsingh@kernel.org>, "Hao Luo" <haoluo@google.com>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"Willem de Bruijn" <willemb@google.com>,
	"David Ahern" <dsahern@kernel.org>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn@kernel.org>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"Network Development" <netdev@vger.kernel.org>
Subject: Re: [RFC bpf-next 0/7] bpf: netdev TX metadata
Date: Wed, 14 Jun 2023 13:59:57 +0200	[thread overview]
Message-ID: <877cs6l0ea.fsf@toke.dk> (raw)
In-Reply-To: <CAADnVQ+CCOw9_LbCAaFz0593eydKNb7RxnGr6_FatUOKmvPmBg@mail.gmail.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Tue, Jun 13, 2023 at 4:16 PM Stanislav Fomichev <sdf@google.com> wrote:
>>
>> On Tue, Jun 13, 2023 at 3:32 PM Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>> >
>> > On Tue, Jun 13, 2023 at 2:17 PM Stanislav Fomichev <sdf@google.com> wrote:
>> > >
>> > > > >> >> > --- UAPI ---
>> > > > >> >> >
>> > > > >> >> > The hooks are implemented in a HID-BPF style. Meaning they don't
>> > > > >> >> > expose any UAPI and are implemented as tracing programs that call
>> > > > >> >> > a bunch of kfuncs. The attach/detach operation happen via BPF syscall
>> > > > >> >> > programs. The series expands device-bound infrastructure to tracing
>> > > > >> >> > programs.
>> > > > >> >>
>> > > > >> >> Not a fan of the "attach from BPF syscall program" thing. These are part
>> > > > >> >> of the XDP data path API, and I think we should expose them as proper
>> > > > >> >> bpf_link attachments from userspace with introspection etc. But I guess
>> > > > >> >> the bpf_mprog thing will give us that?
>> > > > >> >
>> > > > >> > bpf_mprog will just make those attach kfuncs return the link fd. The
>> > > > >> > syscall program will still stay :-(
>> > > > >>
>> > > > >> Why does the attachment have to be done this way, exactly? Couldn't we
>> > > > >> just use the regular bpf_link attachment from userspace? AFAICT it's not
>> > > > >> really piggy-backing on the function override thing anyway when the
>> > > > >> attachment is per-dev? Or am I misunderstanding how all this works?
>> > > > >
>> > > > > It's UAPI vs non-UAPI. I'm assuming kfunc makes it non-UAPI and gives
>> > > > > us an opportunity to fix things.
>> > > > > We can do it via a regular syscall path if there is a consensus.
>> > > >
>> > > > Yeah, the API exposed to the BPF program is kfunc-based in any case. If
>> > > > we were to at some point conclude that this whole thing was not useful
>> > > > at all and deprecate it, it doesn't seem to me that it makes much
>> > > > difference whether that means "you can no longer create a link
>> > > > attachment of this type via BPF_LINK_CREATE" or "you can no longer
>> > > > create a link attachment of this type via BPF_PROG_RUN of a syscall type
>> > > > program" doesn't really seem like a significant detail to me...
>> > >
>> > > In this case, why do you prefer it to go via regular syscall? Seems
>> > > like we can avoid a bunch of boileplate syscall work with a kfunc that
>> > > does the attachment?
>> > > We might as well abstract it at, say, libbpf layer which would
>> > > generate/load this small bpf program to call a kfunc.
>> >
>> > I'm not sure we're on the same page here.
>> > imo using syscall bpf prog that calls kfunc to do a per-device attach
>> > is an overkill here.
>> > It's an experimental feature, but you're already worried about
>> > multiple netdevs?
>> >
>> > Can you add an empty nop function and attach to it tracing style
>> > with fentry ?
>> > It won't be per-netdev, but do you have to do per-device demux
>> > by the kernel? Can your tracing bpf prog do that instead?
>> > It's just an ifindex compare.
>> > This way than non-uapi bits will be even smaller and no need
>> > to change struct netdevice.
>>
>> It's probably going to work if each driver has a separate set of tx
>> fentry points, something like:
>>   {veth,mlx5,etc}_devtx_submit()
>>   {veth,mlx5,etc}_devtx_complete()

I really don't get the opposition to exposing proper APIs; as a
dataplane developer I want to attach a program to an interface. The
kernel's role is to provide a consistent interface for this, not to
require users to become driver developers just to get at the required
details.

> Right. And per-driver descriptors.
> The 'generic' xdptx metadata is unlikely to be practical.
> Marshaling in and out of it is going to be too perf sensitive.
> I'd just add an attach point in the driver with enough
> args for bpf progs to make sense of the context and extend
> the verifier to make few safe fields writeable.

This is a rehashing of the argument we had on the RX side: just exposing
descriptors is a bad API because it forces BPF programs to deal with
hardware errata - which is exactly the kind of thing that belongs in a
driver. Exposing kfuncs allows drivers to deal with hardware quirks
while exposing a consistent API to (BPF) users.

> kfuncs to read/request timestamp are probably too slow.

Which is why we should be inlining them :)

-Toke

  reply	other threads:[~2023-06-14 12:00 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-12 17:23 [RFC bpf-next 0/7] bpf: netdev TX metadata Stanislav Fomichev
2023-06-12 17:23 ` [RFC bpf-next 1/7] bpf: rename some xdp-metadata functions into dev-bound Stanislav Fomichev
2023-06-12 17:23 ` [RFC bpf-next 2/7] bpf: resolve single typedef when walking structs Stanislav Fomichev
2023-06-12 17:23 ` [RFC bpf-next 3/7] bpf: implement devtx hook points Stanislav Fomichev
2023-06-13  2:09   ` kernel test robot
2023-06-13 19:00     ` Stanislav Fomichev
2023-06-13  2:20   ` kernel test robot
2023-06-13  3:11   ` kernel test robot
2023-06-13  4:24   ` kernel test robot
2023-06-13 14:54   ` Willem de Bruijn
2023-06-13 19:00     ` Stanislav Fomichev
2023-06-13 19:29       ` Willem de Bruijn
2023-06-13 15:08   ` Simon Horman
2023-06-13 19:00     ` Stanislav Fomichev
2023-06-14  7:02       ` Simon Horman
2023-06-14 17:18         ` Stanislav Fomichev
2023-06-16  5:46   ` Kui-Feng Lee
2023-06-16 17:32     ` Stanislav Fomichev
2023-06-12 17:23 ` [RFC bpf-next 4/7] bpf: implement devtx timestamp kfunc Stanislav Fomichev
2023-06-13 15:14   ` Simon Horman
2023-06-13 18:39     ` Stanislav Fomichev
2023-06-12 17:23 ` [RFC bpf-next 5/7] net: veth: implement devtx timestamp kfuncs Stanislav Fomichev
2023-06-12 17:23 ` [RFC bpf-next 6/7] selftests/bpf: extend xdp_metadata with devtx kfuncs Stanislav Fomichev
2023-06-13 14:47   ` Willem de Bruijn
2023-06-13 19:00     ` Stanislav Fomichev
2023-06-12 17:23 ` [RFC bpf-next 7/7] selftests/bpf: extend xdp_hw_metadata " Stanislav Fomichev
2023-06-13 15:03   ` Willem de Bruijn
2023-06-13 19:00     ` Stanislav Fomichev
2023-06-12 21:00 ` [RFC bpf-next 0/7] bpf: netdev TX metadata Toke Høiland-Jørgensen
2023-06-13 16:32   ` Stanislav Fomichev
2023-06-13 17:18     ` Toke Høiland-Jørgensen
2023-06-13 18:39       ` Stanislav Fomichev
2023-06-13 19:10         ` Toke Høiland-Jørgensen
2023-06-13 21:17           ` Stanislav Fomichev
2023-06-13 22:32             ` Alexei Starovoitov
2023-06-13 23:16               ` Stanislav Fomichev
2023-06-14  4:19                 ` Alexei Starovoitov
2023-06-14 11:59                   ` Toke Høiland-Jørgensen [this message]
2023-06-14 16:27                     ` Alexei Starovoitov
2023-06-15 12:36                       ` Toke Høiland-Jørgensen
2023-06-15 16:10                         ` Alexei Starovoitov
2023-06-15 16:31                           ` Stanislav Fomichev
2023-06-16  1:50                             ` Jakub Kicinski
2023-06-16  0:09   ` Stanislav Fomichev
2023-06-16  8:12     ` Magnus Karlsson
2023-06-16 17:32       ` Stanislav Fomichev
2023-06-16 23:10         ` Stanislav Fomichev
2023-06-19  7:15           ` Magnus Karlsson
2023-06-14  3:31 ` Jakub Kicinski
2023-06-14  3:54   ` David Ahern
2023-06-14  5:05     ` Jakub Kicinski
2023-06-14 17:17       ` Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877cs6l0ea.fsf@toke.dk \
    --to=toke@kernel.org \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bjorn@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@kernel.org \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=willemb@google.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.