bpf.vger.kernel.org archive mirror
From: John Fastabend <john.fastabend@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	"John Fastabend" <john.fastabend@gmail.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>,
	Saeed Mahameed <saeed@kernel.org>,
	"Raczynski, Piotr" <piotr.raczynski@intel.com>,
	"Zhang, Jessica" <jessica.zhang@intel.com>,
	"Kubiak, Marcin" <marcin.kubiak@intel.com>,
	"Joseph, Jithu" <jithu.joseph@intel.com>,
	"kurt@linutronix.de" <kurt@linutronix.de>,
	"Maloor, Kishen" <kishen.maloor@intel.com>,
	"Gomes, Vinicius" <vinicius.gomes@intel.com>,
	"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	"Swiatkowski, Michal" <michal.swiatkowski@intel.com>,
	"Plantykow, Marta A" <marta.a.plantykow@intel.com>,
	"Ong, Boon Leong" <boon.leong.ong@intel.com>,
	"Desouza, Ederson" <ederson.desouza@intel.com>,
	"Song, Yoong Siang" <yoong.siang.song@intel.com>,
	"Czapnik, Lukasz" <lukasz.czapnik@intel.com>,
	bpf@vger.kernel.org, brouer@redhat.com
Subject: Re: AF_XDP metadata/hints
Date: Wed, 26 May 2021 08:35:49 -0700
Message-ID: <60ae6ad5a2e04_18bf20819@john-XPS-13-9370.notmuch>
In-Reply-To: <87y2c1iqz4.fsf@toke.dk>

Toke Høiland-Jørgensen wrote:
> Jesper Dangaard Brouer <brouer@redhat.com> writes:
> 
> > On Tue, 25 May 2021 21:51:22 -0700
> > John Fastabend <john.fastabend@gmail.com> wrote:
> >
> >> Separate the config of hardware from the BPF infrastructure; these
> >> are two separate things.
> >
> > I fully agree.
> 
> +1. Another reason why is the case of multiple XDP programs on a single
> interface: When attaching these (using freplace as libxdp does it), the
> kernel can just check the dest interface when verifying the freplace
> program and any rewriting of the bytecode from the BTF format can happen
> at that point. Whereas if the BPF attach needs to have side effects,
> suddenly we have to copy over all the features to the dispatcher program
> and do some kind of set union operation; and what happens if an freplace
> program is attached after the fact (same thing with tail calls)?
> 
> So in my mind there's no doubt this needs to be:
> 
> driver is config'ed -> it changes its exposed BTF metadata -> program is
> attached -> verifier rewrites program to access metadata correctly

Well, I think libbpf would likely do the rewrite.
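
To make the rewrite step concrete, here is a minimal user-space sketch, not libbpf or verifier code. It assumes the driver's exposed layout can be viewed as a list of named fields with offsets. The types `meta_field` and `meta_layout` and the helpers `meta_resolve()` and `meta_read()` are hypothetical names for illustration. The program refers to a field by name, the offset is resolved once at load time, and the packet path then reads at that offset.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one metadata field (stands in for a BTF member). */
struct meta_field {
	const char *name;	/* name the program refers to */
	size_t offset;		/* byte offset inside the metadata area */
	size_t size;		/* field width in bytes: 1, 2, 4 or 8 */
};

/* Hypothetical layout a driver might expose after it has been configured. */
struct meta_layout {
	const struct meta_field *fields;
	size_t nr_fields;
};

/* Load-time step: look the field up by name.
 * NULL means the current driver config does not provide the field. */
static const struct meta_field *
meta_resolve(const struct meta_layout *layout, const char *name)
{
	for (size_t i = 0; i < layout->nr_fields; i++)
		if (strcmp(layout->fields[i].name, name) == 0)
			return &layout->fields[i];
	return NULL;
}

/* Run-time step: read a resolved field in host byte order.
 * Returns 0 on success, -1 if the field is missing or out of bounds. */
static int meta_read(const void *meta, size_t meta_len,
		     const struct meta_field *f, uint64_t *out)
{
	const uint8_t *p;

	if (!f || f->offset > meta_len || f->size > meta_len - f->offset)
		return -1;
	p = (const uint8_t *)meta + f->offset;

	switch (f->size) {
	case 1: { uint8_t v;  memcpy(&v, p, sizeof(v)); *out = v; return 0; }
	case 2: { uint16_t v; memcpy(&v, p, sizeof(v)); *out = v; return 0; }
	case 4: { uint32_t v; memcpy(&v, p, sizeof(v)); *out = v; return 0; }
	case 8: { uint64_t v; memcpy(&v, p, sizeof(v)); *out = v; return 0; }
	default: return -1;
	}
}
```

If the driver is reconfigured and the layout changes, only the resolve step has to be repeated. The program source itself does not change.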

> 
> > How should we handle existing config interfaces?
> >
> > Let me give some concrete examples. Today there are multiple existing
> > interfaces to enable/disable NIC hardware features that change what is
> > available to put in our BTF-layout.
> >
> > E.g. changing if VLAN is in descriptor:
> >  # ethtool -K ixgbe1 rx-vlan-offload off
> >  # ethtool -k ixgbe1 | grep vlan-offload
> >  rx-vlan-offload: off
> >  tx-vlan-offload: on
> >
> > The timestamping features can be listed by ethtool -T (see below
> > signature), but it is a socket option that enables[1] these
> > (see SO_TIMESTAMPNS or SOF_TIMESTAMPING_RX_HARDWARE).
> >
> > Or tuning RSS hash fields:
> >  [2] https://github.com/stackpath/rxtxcpu/blob/master/Documentation/case-studies/observing-rss-on-ixgbe-advanced-rss-configuration-rss-hash-fields.md
> >
> > I assume we need to stay compatible and respect the existing config
> > interfaces, right?
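
For reference, the socket-option path mentioned above looks roughly like the sketch below. The SOF_TIMESTAMPING_* flags are passed through the SO_TIMESTAMPING socket option. The helper name `enable_hw_rx_timestamps()` is made up for this example, and the choice of flags is an assumption about what a receiver of hardware timestamps would typically want.

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/* Ask the kernel to report hardware RX timestamps on this socket.
 * Returns 0 on success, -1 with errno set on failure. */
static int enable_hw_rx_timestamps(int fd)
{
	int flags = SOF_TIMESTAMPING_RX_HARDWARE |	/* timestamp RX packets in hardware */
		    SOF_TIMESTAMPING_RAW_HARDWARE;	/* report the raw hardware timestamp */

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}
```

This only covers the per-socket side. The NIC itself still has to be told to timestamp packets (the SIOCSHWTSTAMP ioctl), and the timestamps are delivered as SCM_TIMESTAMPING control messages via recvmsg().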

I'm not convinced it's a strict requirement; rather, it's a nice-to-have.
These are low-level ethtool hooks into the hardware. It's fine IMO if
the hardware just reports 'off' and uses a more robust configuration
channel. In general we should try to get away from this model where
kernel devs act as the gatekeepers for all hardware offloads and we
explicitly add checkboxes that driver writers can use. The result is
the current state of things, where we have very flexible hardware that
is not usable from Linux.

> >
> > Should we simply leverage existing interfaces?
> 
> Now that ethtool has moved to netlink it should be quite
> straight-forward to add a separate subset of commands for configuring
> metadata fields; and internally the kernel can map those to the existing
> config knobs, no?

It's unclear to me how you simply expose knobs to reconfigure hardware.
It looks to me like you need to push a blob down to the hardware to
reconfigure it for new parsers, new actions, etc. But maybe the
folks working on current hardware can speak up.

> 
> E.g., if you tell the kernel you'd like to have the VLAN field as a
> metadata field that kinda implies that rx-vlan-offload should be turned
> on; etc. Any reason this would break down?
> 
> -Toke
> 

Agreed, the driver should be able to map these back onto 'legacy'
feature sets.
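
A rough sketch of what such a mapping could look like, following the VLAN example above. All names and bit values here (`META_REQ_*`, `FEAT_*`, `features_for_metadata()`) are invented for illustration and do not correspond to existing kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata fields a user could ask for. */
enum meta_req {
	META_REQ_VLAN    = 1u << 0,
	META_REQ_RX_HASH = 1u << 1,
};

/* Hypothetical stand-ins for legacy feature bits
 * (compare rx-vlan-offload as shown by ethtool -k). */
enum legacy_feature {
	FEAT_RX_VLAN_OFFLOAD = 1u << 8,
	FEAT_RX_HASH         = 1u << 9,
};

/* One row per metadata field: which legacy feature it implies. */
static const struct {
	uint32_t req;
	uint32_t feature;
} meta_to_feature[] = {
	{ META_REQ_VLAN,    FEAT_RX_VLAN_OFFLOAD },
	{ META_REQ_RX_HASH, FEAT_RX_HASH },
};

/* Return the legacy feature bits implied by the requested metadata fields. */
static uint32_t features_for_metadata(uint32_t requested)
{
	uint32_t features = 0;
	size_t n = sizeof(meta_to_feature) / sizeof(meta_to_feature[0]);

	for (size_t i = 0; i < n; i++)
		if (requested & meta_to_feature[i].req)
			features |= meta_to_feature[i].feature;
	return features;
}
```

A driver could then apply the returned bits through whatever path it already uses for its legacy feature flags.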

I still have a basic question, though. I've never invested much time
in the hints because it's still not clear to me what the use case is.
What would we put in the hints, and do we have any data to show it
would be a performance win?

If it's a simple hash of the headers, then how would we use it? The
map lookups/updates use IP addrs as keys in Cilium. So I think the
suggestion is to offload the jhash operation? But that requires some
program changes to work. Could someone convince me?
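
To illustrate the kind of program change I mean, here is a small user-space sketch. Everything in it is an assumption for illustration: `struct flow_key`, `struct rx_hint` and `flow_hash()` are made-up names, and FNV-1a is used only as a simple stand-in for jhash.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key built from IP addresses. */
struct flow_key {
	uint32_t saddr;
	uint32_t daddr;
};

/* Hypothetical hint delivered with the packet. */
struct rx_hint {
	bool hash_valid;	/* set when the hardware filled in 'hash' */
	uint32_t hash;
};

/* FNV-1a over the key bytes; a simple stand-in for jhash in this sketch. */
static uint32_t sw_hash(const struct flow_key *k)
{
	const uint8_t *p = (const uint8_t *)k;
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < sizeof(*k); i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Use the hardware hash when one is present, otherwise hash in software. */
static uint32_t flow_hash(const struct rx_hint *hint, const struct flow_key *k)
{
	return (hint && hint->hash_valid) ? hint->hash : sw_hash(k);
}
```

Note the catch: a hardware hash will generally not equal the software hash for the same key, so a table has to be indexed by one or the other consistently. A fallback like the one above is only safe if the result is not mixed within a single table, which is part of why this is not a drop-in change.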

Maybe packet timestamp?

Thanks,
John
