From: "Arthur Fabre" <afabre@cloudflare.com>
To: "Lorenzo Bianconi" <lorenzo@kernel.org>,
"Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: "Lorenzo Bianconi" <lorenzo.bianconi@redhat.com>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"Jakub Sitnicki" <jakub@cloudflare.com>,
"Alexander Lobakin" <aleksander.lobakin@intel.com>,
<bpf@vger.kernel.org>, <netdev@vger.kernel.org>, <ast@kernel.org>,
<daniel@iogearbox.net>, <davem@davemloft.net>, <kuba@kernel.org>,
<john.fastabend@gmail.com>, <edumazet@google.com>,
<pabeni@redhat.com>, <sdf@fomichev.me>, <tariqt@nvidia.com>,
<saeedm@nvidia.com>, <anthony.l.nguyen@intel.com>,
<przemyslaw.kitszel@intel.com>,
<intel-wired-lan@lists.osuosl.org>, <mst@redhat.com>,
<jasowang@redhat.com>, <mcoquelin.stm32@gmail.com>,
<alexandre.torgue@foss.st.com>,
"kernel-team" <kernel-team@cloudflare.com>,
"Yan Zhai" <yan@cloudflare.com>
Subject: Re: [RFC bpf-next 0/4] Add XDP rx hw hints support performing XDP_REDIRECT
Date: Tue, 01 Oct 2024 16:16:02 +0200 [thread overview]
Message-ID: <D4KJ7DUXJQC5.2UFST9L3CUOH7@bobby> (raw)
In-Reply-To: <ZvqQOpqnK9hBmXNn@lore-desk>
On Mon Sep 30, 2024 at 1:49 PM CEST, Lorenzo Bianconi wrote:
> > Lorenzo Bianconi <lorenzo@kernel.org> writes:
> >
> > >> > We could combine such a registration API with your header format, so
> > >> > that the registration just becomes a way of allocating one of the keys
> > >> > from 0-63 (and the registry just becomes a global copy of the header).
> > >> > This would basically amount to moving the "service config file" into the
> > >> > kernel, since that seems to be the only common denominator we can rely
> > >> > on between BPF applications (as all attempts to write a common daemon
> > >> > for BPF management have shown).
> > >>
> > >> That sounds reasonable. And I guess we'd have set() check the global
> > >> registry to enforce that the key has been registered beforehand?
> > >>
> > >> >
> > >> > -Toke
> > >>
> > >> Thanks for all the feedback!
> > >
> > > I like this 'fast' KV approach but I guess we should really evaluate its
> > > impact on performances (especially for xdp) since, based on the kfunc calls
> > > order in the ebpf program, we can have one or multiple memmove/memcpy for
> > > each packet, right?
> >
> > Yes, with Arthur's scheme, performance will be ordering dependent. Using
> > a global registry for offsets would sidestep this, but have the
> > synchronisation issues we discussed up-thread. So on balance, I think
> > the memmove() suggestion will probably lead to the least pain.
> >
> > For the HW metadata we could sidestep this by always having a fixed
> > struct for it (but using the same set/get() API with reserved keys). The
> > only drawback of doing that is that we statically reserve a bit of
> > space, but I'm not sure that is such a big issue in practice (at least
> > not until this becomes to popular that the space starts to be contended;
> > but surely 256 bytes ought to be enough for everybody, right? :)).
>
> I am fine with the proposed approach, but I think we need to verify what is the
> impact on performances (in the worst case??)
If drivers are responsible for populating the hardware metadata before
XDP, we could make sure drivers set the fields in order to avoid any
memove() (and maybe even provide a helper to ensure this?).
> > > Moreover, I still think the metadata area in the xdp_frame/xdp_buff is not
> > > so suitable for nic hw metadata since:
> > > - it grows backward
> > > - it is probably in a different cacheline with respect to xdp_frame
> > > - nic hw metadata will not start at fixed and immutable address, but it depends
> > > on the running ebpf program
> > >
> > > What about having something like:
> > > - fixed hw nic metadata: just after xdp_frame struct (or if you want at the end
> > > of the metadata area :)). Here he can reuse the same KV approach if it is fast
> > > - user defined metadata: in the metadata area of the xdp_frame/xdp_buff
> >
> > AFAIU, none of this will live in the (current) XDP metadata area. It
> > will all live just after the xdp_frame struct (so sharing the space with
> > the metadata area in the sense that adding more metadata kv fields will
> > decrease the amount of space that is usable by the current XDP metadata
> > APIs).
> >
> > -Toke
> >
>
> ah, ok. I was thinking the proposed approach was to put them in the current
> metadata field.
I've also been thinking of putting this new KV stuff at the start of the
headroom (I think that's what you're saying Toke?). It has a few nice
advantanges:
* It coexists nicely with the current XDP / TC metadata support.
Those users won't be able to overwrite / corrupt the KV metadata.
KV users won't need to call xdp_adjust_meta() (which would be awkward -
how would they know how much space the KV implementation needs).
* We don't have to move all the metadata everytime we call
xdp_adjust_head() (or the kernel equivalent).
Are there any performance implications of that, e.g. for caching?
This would also grow "upwards" which is more natural, but I think
either way the KV API would hide whether it's downwards or upwards from
users.
>
> Regards,
> Lorenzo
next prev parent reply other threads:[~2024-10-01 14:16 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-21 16:52 [RFC bpf-next 0/4] Add XDP rx hw hints support performing XDP_REDIRECT Lorenzo Bianconi
2024-09-21 16:52 ` [RFC bpf-next 1/4] net: xdp: Add xdp_rx_meta structure Lorenzo Bianconi
2024-09-21 16:52 ` [RFC bpf-next 2/4] net: xdp: Update rx_hash of xdp_rx_meta struct running xmo_rx_hash callback Lorenzo Bianconi
2024-09-21 16:52 ` [RFC bpf-next 3/4] net: xdp: Update rx_vlan of xdp_rx_meta struct running xmo_rx_vlan_tag callback Lorenzo Bianconi
2024-09-21 16:53 ` [RFC bpf-next 4/4] net: xdp: Update rx timestamp of xdp_rx_meta struct running xmo_rx_timestamp callback Lorenzo Bianconi
2024-09-21 20:17 ` [RFC bpf-next 0/4] Add XDP rx hw hints support performing XDP_REDIRECT Alexander Lobakin
2024-09-21 21:36 ` Jesper Dangaard Brouer
2024-09-22 9:17 ` Lorenzo Bianconi
2024-09-22 11:12 ` Toke Høiland-Jørgensen
2024-09-22 15:40 ` Lorenzo Bianconi
2024-09-26 10:54 ` Toke Høiland-Jørgensen
2024-09-26 14:57 ` Lorenzo Bianconi
2024-09-27 1:43 ` Stanislav Fomichev
2024-09-26 11:31 ` Arthur Fabre
2024-09-26 12:41 ` Toke Høiland-Jørgensen
2024-09-26 15:44 ` Arthur Fabre
2024-09-27 10:24 ` Toke Høiland-Jørgensen
2024-09-27 14:46 ` Arthur Fabre
2024-09-27 15:06 ` Lorenzo Bianconi
2024-09-30 10:58 ` Toke Høiland-Jørgensen
2024-09-30 11:49 ` Lorenzo Bianconi
2024-10-01 14:16 ` Arthur Fabre [this message]
2024-10-01 14:54 ` Lorenzo Bianconi
2024-10-01 15:14 ` Toke Høiland-Jørgensen
2024-10-02 17:02 ` Stanislav Fomichev
2024-10-02 18:38 ` Toke Høiland-Jørgensen
2024-10-02 22:49 ` Stanislav Fomichev
2024-10-03 6:35 ` Arthur Fabre
2024-10-03 20:26 ` Stanislav Fomichev
2024-10-04 2:13 ` Daniel Xu
2024-10-04 10:38 ` Jesper Dangaard Brouer
2024-10-04 13:55 ` Arthur Fabre
2024-10-04 14:14 ` Jesper Dangaard Brouer
2024-10-04 14:18 ` Lorenzo Bianconi
2024-10-04 14:29 ` Arthur Fabre
2024-10-04 17:53 ` Stanislav Fomichev
2024-10-06 10:27 ` Toke Høiland-Jørgensen
2024-10-07 18:48 ` Stanislav Fomichev
2024-10-08 7:15 ` Arthur Fabre
2024-10-04 16:27 ` Stanislav Fomichev
2024-09-30 10:52 ` Toke Høiland-Jørgensen
2024-10-01 14:06 ` Arthur Fabre
2024-10-01 15:28 ` Toke Høiland-Jørgensen
2024-10-03 6:51 ` Arthur Fabre
2024-09-22 9:08 ` Lorenzo Bianconi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=D4KJ7DUXJQC5.2UFST9L3CUOH7@bobby \
--to=afabre@cloudflare.com \
--cc=aleksander.lobakin@intel.com \
--cc=alexandre.torgue@foss.st.com \
--cc=anthony.l.nguyen@intel.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hawk@kernel.org \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=jakub@cloudflare.com \
--cc=jasowang@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=kernel-team@cloudflare.com \
--cc=kuba@kernel.org \
--cc=lorenzo.bianconi@redhat.com \
--cc=lorenzo@kernel.org \
--cc=mcoquelin.stm32@gmail.com \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=przemyslaw.kitszel@intel.com \
--cc=saeedm@nvidia.com \
--cc=sdf@fomichev.me \
--cc=tariqt@nvidia.com \
--cc=toke@redhat.com \
--cc=yan@cloudflare.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox