bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: BPF-dev-list <bpf@vger.kernel.org>,
	"Alexander Lobakin" <alexandr.lobakin@intel.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Magnus Karlsson" <magnus.karlsson@gmail.com>,
	"David Ahern" <dsahern@kernel.org>,
	"Björn Töpel" <bjorn@kernel.org>,
	"Saeed Mahameed" <saeed@kernel.org>,
	"kurt@linutronix.de" <kurt@linutronix.de>,
	"Raczynski, Piotr" <piotr.raczynski@intel.com>,
	"Zhang, Jessica" <jessica.zhang@intel.com>,
	"Maloor, Kishen" <kishen.maloor@intel.com>,
	"Gomes, Vinicius" <vinicius.gomes@intel.com>,
	"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	"Swiatkowski, Michal" <michal.swiatkowski@intel.com>,
	"Plantykow, Marta A" <marta.a.plantykow@intel.com>,
	"Desouza, Ederson" <ederson.desouza@intel.com>,
	"Song, Yoong Siang" <yoong.siang.song@intel.com>,
	"Czapnik, Lukasz" <lukasz.czapnik@intel.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Joseph, Jithu" <jithu.joseph@intel.com>,
	"William Tu" <u9012063@gmail.com>,
	"Ong Boon Leong" <boon.leong.ong@intel.com>,
	brouer@redhat.com,
	"Lorenzo Bianconi" <lorenzo.bianconi@redhat.com>
Subject: Re: XDP-hints: Howto support multiple BTF types per packet basis?
Date: Wed, 26 May 2021 22:20:23 +0200	[thread overview]
Message-ID: <20210526222023.44f9b3c6@carbon> (raw)
In-Reply-To: <CAEf4BzYXUDyQaBjZmb_Q5-z3jw1-Uvdgxm+cfcQjSwb9oRoXnQ@mail.gmail.com>

On Wed, 26 May 2021 12:12:09 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Wed, May 26, 2021 at 3:59 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > Hi All,
> >
> > I see a need for a driver to use different XDP metadata layout on a per
> > packet basis. E.g. PTP packets contains a hardware timestamp. E.g. VLAN
> > offloading and associated metadata as only relevant for packets using
> > VLANs. (Reserving room for every possible HW-hint is against the idea
> > of BTF).
> >
> > The question is how to support multiple BTF types on per packet basis?
> > (I need input from BTF experts, to tell me if I'm going in the wrong
> > direction with below ideas).  
> 
> I'm trying to follow all three threads, but still, can someone dumb it
> down for me and use few very specific examples to show how all this is
> supposed to work end-to-end. I.e., how the C definition for those
> custom BTF layouts might look like and how they are used in BPF
> programs, etc. I'm struggling to put all the pieces together, even
> ignoring all the netdev-specific configuration questions.

I admit that this thread is pushing the boundaries and "ask" too much.
I think we need some steps in-between to get the ball rolling first.  I
myself need to learn more of what is possible today with BTF, before I
ask for more features and multiple simultaneous BTF IDs.

I will go read Andrii's excellent docs [1]+[2] *again*, and perhaps[3].
Do you recommend other BTF docs?
 
 [1] https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html
 [2] https://nakryiko.com/posts/bpf-portability-and-co-re/
 [3] https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html 

> As for BTF on a per-packet basis. This means that BTF itself is not
> known at the BPF program verification time, so there will be some sort
> of if/else if/else conditions to handle all recognized BTF IDs? Is
> that right? 

I do want libbpf CO-RE and BPF program verification to work.  I'm
asking for a BPF-program that can supply multiple BTF struct layouts
and get all of them CO-RE offset adjusted.

The XDP/BPF-prog itself have if/else conditions on BPF-IDs to handle
all the BPF IDs it knows.  When loading the BPF-prog the offset
relocation are done for the code (as usual I presume).

Maybe it is worth pointing out, that the reason for requiring the
BPF-prog to check the BPF-ID match, is to solve the netdev HW feature
update problem.  I'm basically evil and say we can update the netdev HW
features anytime, because it is the BPF programmers responsibility to
check if BTF info changed (after prog was loaded). (The BPF programmer
can solve this via requesting all the possible BTF IDs the driver can
change between, or choose she is only interested in a single variant).

By this, I'm trying to avoid loading an XDP-prog locks down what
hardware features can be enabled/disabled.  It would be sad running
tcpdump (-j adapter_unsynced) that request for HW RX-timestamp is
blocked due to XDP being loaded.


> Fake but specific code would help (at least me) to actually join the
> discussion. Thanks.

I agree, I actually want to code-up a simple example that use BTF CO-RE
and then try to follow the libbpf code that adjust the offsets.  I
admit I need to understand BTF better myself, before I ask for
new/advanced features ;-)

Thanks Andrii for giving us feedback, I do need to learn more about BTF
myself to join the discussion myself.


> >
> > Let me describe a possible/proposed packet flow (feel free to
> > disagree):
> >
> >  When driver RX e.g. a PTP packet it knows HW is configured for
> > PTP-TS and when it sees a TS is available, then it chooses a code
> > path that use the BTF layout that contains RX-TS. To communicate
> > what BTF-type the XDP-metadata contains, it simply store the BTF-ID
> > in xdp_buff->btf_id.
> >
> >  When redirecting the xdp_buff is converted to xdp_frame, and also
> > contains the btf_id member. When converting xdp_frame to SKB, then
> > netcore-code checks if this BTF-ID have been registered, if so
> > there is a (callback or BPF-hook) registered to handle this
> > BTF-type that transfer the fields from XDP-metadata area into SKB
> > fields.
> >
> >  The XDP-prog also have access to this ctx->btf_id and can
> > multiplex on this in the BPF-code itself. Or use other methods like
> > parsing PTP packet and extract TS as expected BTF offset in XDP
> > metadata (perhaps add a sanity check if metadata-size match).
> >
> >
> > I talked to AF_XDP people (Magnus, Bjørn and William) about this
> > idea, and they pointed out that AF_XDP also need to know what
> > BTF-layout is used. As Magnus wrote in other thread; there is only
> > 32-bit left in AF_XDP descriptor option. We could store the BTF-ID
> > in this field, but it would block for other use-cases. Bjørn came
> > up with the idea of storing the BTF-ID in the BTF-layout itself,
> > but as the last-member (to have fixed offset to check in userspace
> > AF_XDP program). Then we only need to use a single bit in AF_XDP
> > descriptor option to say XDP-metadata is BTF described.
> >
> > In the AF_XDP userspace program, the programmers can have a similar
> > callback system per known BTF-ID. This way they can compile
> > efficient code per ID via requesting the BTF layout from the
> > kernel. (Hint: `bpftool btf dump id 42 format c`).
> >
> > Please let me know if this it the right or wrong direction?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


  reply	other threads:[~2021-05-26 20:20 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-26 10:58 XDP-hints: Howto support multiple BTF types per packet basis? Jesper Dangaard Brouer
2021-05-26 19:12 ` Andrii Nakryiko
2021-05-26 20:20   ` Jesper Dangaard Brouer [this message]
2021-05-26 20:37     ` John Fastabend
2021-05-26 22:39     ` Andrii Nakryiko
2021-05-28 11:16       ` Jesper Dangaard Brouer
2021-05-30  3:24         ` Andrii Nakryiko
2021-05-26 20:31   ` John Fastabend
2021-05-26 22:54     ` Andrii Nakryiko
2021-05-27  0:44       ` John Fastabend
2021-05-27 17:44         ` Andrii Nakryiko
2021-05-28  5:48           ` John Fastabend
2021-05-28  9:16             ` Toke Høiland-Jørgensen
2021-05-28 10:38               ` Alexander Lobakin
2021-05-28 14:35               ` John Fastabend
2021-05-28 15:33                 ` Toke Høiland-Jørgensen
2021-05-28 16:02                 ` Jesper Dangaard Brouer
2021-05-28 17:29                   ` John Fastabend
2021-05-30  3:27                     ` Andrii Nakryiko
2021-05-31 11:03                     ` Toke Høiland-Jørgensen
2021-05-31 13:17                       ` Jesper Dangaard Brouer
2021-06-02  0:22                       ` John Fastabend
2021-06-02 16:18                         ` Jakub Kicinski
2021-06-22  7:44                           ` Michal Swiatkowski
2021-06-22 11:53                             ` Toke Høiland-Jørgensen
2021-06-23  8:32                               ` Michal Swiatkowski
2021-06-24 12:23                                 ` Toke Høiland-Jørgensen
2021-06-24 13:07                                   ` Magnus Karlsson
2021-06-24 14:58                                     ` Alexei Starovoitov
2021-06-24 15:11                                   ` Zvi Effron
2021-06-24 16:04                                     ` Toke Høiland-Jørgensen
2021-06-24 16:32                                       ` Zvi Effron
2021-06-24 16:45                                       ` Jesper Dangaard Brouer
2021-07-08  8:32                                   ` Michal Swiatkowski
2021-07-09 10:57                                     ` Toke Høiland-Jørgensen
2021-09-02  2:49                                       ` Michal Swiatkowski
2021-09-02  9:17                                         ` Jesper Dangaard Brouer
2021-09-07  6:27                                           ` Michal Swiatkowski
2021-09-08 13:28                                             ` Jesper Dangaard Brouer
2021-09-09 18:19                                               ` Andrii Nakryiko
2021-09-10 11:16                                                 ` Jesper Dangaard Brouer
2021-07-28  9:54 ` XDP-hints: how to inform driver about hints Michal Swiatkowski
2021-07-28 18:40   ` John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210526222023.44f9b3c6@carbon \
    --to=brouer@redhat.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bjorn@kernel.org \
    --cc=boon.leong.ong@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=dsahern@kernel.org \
    --cc=ederson.desouza@intel.com \
    --cc=jesse.brandeburg@intel.com \
    --cc=jessica.zhang@intel.com \
    --cc=jithu.joseph@intel.com \
    --cc=john.fastabend@gmail.com \
    --cc=kishen.maloor@intel.com \
    --cc=kurt@linutronix.de \
    --cc=lorenzo.bianconi@redhat.com \
    --cc=lukasz.czapnik@intel.com \
    --cc=magnus.karlsson@gmail.com \
    --cc=magnus.karlsson@intel.com \
    --cc=marta.a.plantykow@intel.com \
    --cc=michal.swiatkowski@intel.com \
    --cc=piotr.raczynski@intel.com \
    --cc=saeed@kernel.org \
    --cc=u9012063@gmail.com \
    --cc=vinicius.gomes@intel.com \
    --cc=yoong.siang.song@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).