public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Stanislav Fomichev <stfomichev@gmail.com>
To: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: "Daniel Borkmann" <borkmann@iogearbox.net>,
	"Lorenzo Bianconi" <lorenzo@kernel.org>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Eric Dumazet" <eric.dumazet@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Paolo Abeni" <pabeni@redhat.com>,
	sdf@fomichev.me, kernel-team@cloudflare.com,
	arthur@arthurfabre.com, jakub@cloudflare.com,
	"Magnus Karlsson" <magnus.karlsson@intel.com>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	arzeznik@cloudflare.com, "Yan Zhai" <yan@cloudflare.com>
Subject: Re: [PATCH bpf-next V1 7/7] net: xdp: update documentation for xdp-rx-metadata.rst
Date: Mon, 16 Jun 2025 08:34:31 -0700	[thread overview]
Message-ID: <aFA5hxzOkxVMB_eZ@mini-arch> (raw)
In-Reply-To: <ca38f2ed-999f-4ce1-8035-8ee9247f27f2@kernel.org>

On 06/13, Jesper Dangaard Brouer wrote:
> 
> 
> 
> On 11/06/2025 05.40, Stanislav Fomichev wrote:
> > On 06/11, Lorenzo Bianconi wrote:
> > > > Daniel Borkmann <daniel@iogearbox.net> writes:
> > > > 
> > > [...]
> > > > > > 
> > > > > > Why not have a new flag for bpf_redirect that transparently stores all
> > > > > > available metadata? If you care only about the redirect -> skb case.
> > > > > > Might give us more wiggle room in the future to make it work with
> > > > > > traits.
> > > > > 
> > > > > Also q from my side: If I understand the proposal correctly, in order to fully
> > > > > populate an skb at some point, you have to call all the bpf_xdp_metadata_* kfuncs
> > > > > to collect the data from the driver descriptors (indirect call), and then yet
> > > > > again all equivalent bpf_xdp_store_rx_* kfuncs to re-store the data in struct
> > > > > xdp_rx_meta again. This seems rather costly and once you add more kfuncs with
> > > > > meta data aren't you better off switching to tc(x) directly so the driver can
> > > > > do all this natively? :/
> > > > 
> > > > I agree that the "one kfunc per metadata item" scales poorly. IIRC, the
> > > > hope was (back when we added the initial HW metadata support) that we
> > > > would be able to inline them to avoid the function call overhead.
> > > > 
> > > > That being said, even with half a dozen function calls, that's still a
> > > > lot less overhead from going all the way to TC(x). The goal of the use
> > > > case here is to do as little work as possible on the CPU that initially
> > > > receives the packet, instead moving the network stack processing (and
> > > > skb allocation) to a different CPU with cpumap.
> > > > 
> > > > So even if the *total* amount of work being done is a bit higher because
> > > > of the kfunc overhead, that can still be beneficial because it's split
> > > > between two (or more) CPUs.
> > > > 
> > > > I'm sure Jesper has some concrete benchmarks for this lying around
> > > > somewhere, hopefully he can share those :)
> > > 
> > > Another possible approach would be to have some utility functions (not kfuncs)
> > > used to 'store' the hw metadata in the xdp_frame that are executed in each
> > > driver codebase before performing XDP_REDIRECT. The downside of this approach
> > > is we need to parse the hw metadata twice if the eBPF program that is bounded
> > > to the NIC is consuming these info. What do you think?
> > 
> > That's the option I was asking about. I'm assuming we should be able
> > to reuse existing xmo metadata callbacks for this. We should be able
> > to hide it from the drivers also hopefully.
> 
> I'm not against this idea of transparently stores all available metadata
> into the xdp_frame (via some flag/config), but it does not fit our
> production use-case.  I also think that this can be added later.
> 
> We need the ability to overwrite the RX-hash value, before redirecting
> packet to CPUMAP (remember as cover-letter describe RX-hash needed
> *before* the GRO engine processes the packet in CPUMAP. This is before
> TC/BPF).

Make sense. Can we make GRO not flush a bucket for same_flow=0 instead?
This will also make it work better for other regular tunneled traffic.
Setting hash in BPF to make GRO go fast seems too implementation specific :-(

  reply	other threads:[~2025-06-16 15:34 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-03 17:45 [PATCH bpf-next V1 0/7] xdp: Propagate RX HW hints for XDP_REDIRECTed packets via xdp_frame Jesper Dangaard Brouer
2025-06-03 17:45 ` [PATCH bpf-next V1 1/7] net: xdp: Add xdp_rx_meta structure Jesper Dangaard Brouer
2025-06-03 17:46 ` [PATCH bpf-next V1 2/7] selftests/bpf: Adjust test for maximum packet size in xdp_do_redirect Jesper Dangaard Brouer
2025-06-03 17:46 ` [PATCH bpf-next V1 3/7] net: xdp: Add kfuncs to store hw metadata in xdp_buff Jesper Dangaard Brouer
2025-06-16 21:55   ` Jakub Kicinski
2025-06-03 17:46 ` [PATCH bpf-next V1 4/7] net: xdp: Set skb hw metadata from xdp_frame Jesper Dangaard Brouer
2025-06-03 17:46 ` [PATCH bpf-next V1 5/7] net: veth: Read xdp metadata from rx_meta struct if available Jesper Dangaard Brouer
2025-06-03 17:46 ` [PATCH bpf-next V1 6/7] bpf: selftests: Add rx_meta store kfuncs selftest Jesper Dangaard Brouer
2025-06-06 21:57   ` Alexei Starovoitov
2025-06-06 22:16     ` Lorenzo Bianconi
2025-06-03 17:46 ` [PATCH bpf-next V1 7/7] net: xdp: update documentation for xdp-rx-metadata.rst Jesper Dangaard Brouer
2025-06-06  2:45   ` Stanislav Fomichev
2025-06-10 13:48     ` Daniel Borkmann
2025-06-10 20:12       ` Toke Høiland-Jørgensen
2025-06-10 22:26         ` Lorenzo Bianconi
2025-06-11  3:40           ` Stanislav Fomichev
2025-06-13 10:59             ` Jesper Dangaard Brouer
2025-06-16 15:34               ` Stanislav Fomichev [this message]
2025-06-17 16:15                 ` Jesper Dangaard Brouer
2025-06-17 20:47                   ` Stanislav Fomichev
2025-06-16 12:37             ` Lorenzo Bianconi
2025-06-16 15:45               ` Stanislav Fomichev
2025-06-16 16:40                 ` Lorenzo Bianconi
2025-06-17 11:50                   ` Toke Høiland-Jørgensen
2025-06-17 14:47                     ` Jesper Dangaard Brouer
2025-06-17 15:10                       ` Performance impact of disabling VLAN offload [was: Re: [PATCH bpf-next V1 7/7] net: xdp: update documentation for xdp-rx-metadata.rst] Toke Høiland-Jørgensen
2025-06-19 12:09                         ` Jesper Dangaard Brouer
2025-06-19 12:23                           ` Toke Høiland-Jørgensen
2025-06-13 11:18         ` [PATCH bpf-next V1 7/7] net: xdp: update documentation for xdp-rx-metadata.rst Daniel Borkmann
2025-06-16 11:51           ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aFA5hxzOkxVMB_eZ@mini-arch \
    --to=stfomichev@gmail.com \
    --cc=arthur@arthurfabre.com \
    --cc=arzeznik@cloudflare.com \
    --cc=ast@kernel.org \
    --cc=borkmann@iogearbox.net \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=hawk@kernel.org \
    --cc=jakub@cloudflare.com \
    --cc=kernel-team@cloudflare.com \
    --cc=kuba@kernel.org \
    --cc=lorenzo@kernel.org \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=toke@redhat.com \
    --cc=yan@cloudflare.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox