From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: "Yan Zhai" <yan@cloudflare.com>,
"Stanislav Fomichev" <sdf@google.com>,
Netdev <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
kernel-team <kernel-team@cloudflare.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Eric Dumazet" <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>,
"Jakub Sitnicki" <jakub@cloudflare.com>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
"Edward Cree" <ecree.xilinx@gmail.com>
Subject: Re: Does skb_metadata_differs really need to stop GRO aggregation?
Date: Thu, 30 Nov 2023 14:00:48 -0800 [thread overview]
Message-ID: <e23e6332-7c09-42b8-83a0-1a8029132ccc@linux.dev> (raw)
In-Reply-To: <e3402045-a36f-461f-8eab-bbc51735492d@kernel.org>
On 11/30/23 12:35 PM, Jesper Dangaard Brouer wrote:
> I should explain our use-case(s) a bit more.
> We do want the information to survive XDP_PASS into the SKB.
> Its the hole point, as we want to transfer information from XDP layer to
> TC-layer and perhaps further all the way to BPF socket filters (I even
> heard someone asked for).
>
> I'm trying to get an overview, as I now have multiple product teams that
> want to store information across/into differ layer, and they have other
> teams that consume this information.
>
> We are exploring more options than only XDP metadata area to store
> information. I have suggested that once an SKB have a socket
> associated, then we can switch into using BPF local socket storage
> tricks. (The lifetime of XDP metadata is not 100% clear as e.g.
> pskb_expand_head clears it via skb_metadata_clear).
> All ideas are welcome, e.g. I'm also looking at ability to store
> auxiliary/metadata data associated with a dst_entry. And SKB->mark is
> already used for other use-cases and isn't big enough. (and then there
> is fun crossing a netns boundry).
>
> Let me explain *one* of the concrete use-cases. As described in [1],
> the CF XDP L4 load-balancer Unimog have been extended to a product
> called Plurimog that does load-balancing across data-centers "colo's".
> When Plurimog redirects to another colo, the original "landing" colo's
> ID is carried across (in some encap header) to a Unimog instance. Thus,
> the original landing Colo ID is known to Unimog running in another colo,
> but that header is popped, so this info need to be transferred somehow.
> I'm told that even the webserver/Nginx need to know the orig/foreign
> landing colo ID (here there should be socket associated). For TCP SYN
> packets, the layered DOS protecting also need to know foreign landing
> colo ID. Other teams/products needs this for accounting, e.g. Traffic
> Manager[1], Radar[2] and Capacity planning.
We also bumped into a usecase about plumbing the RX timestamp taken at XDP to
its final "sk" for analysis purpose. The usecase had not materialized.
fwiw, one of my thoughts at that time is similar to your sk local storage
thinking, do a bpf_sk_lookup_tcp at xdp and store the stats there. It will waste
the lookup effort because there is no skb to do the bpf_sk_assign(). Then follow
this direction of thought is to allocate a skb in the xdp prog itself if we know
it is a XDP_PASS.
That said, the sk storage approach would work fine if whatever it wants to
collect from xdp_md(s)/skb(s) can be stacked/aggregated in a sk. It would be
nicer if the __sk_buff->data_meta can work more like other bpf local storage
(sk, task, cgroup...etc) such that it will be available to other bpf prog type
(e.g. a tracing prog).
next prev parent reply other threads:[~2023-11-30 22:00 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-28 12:37 Does skb_metadata_differs really need to stop GRO aggregation? Jesper Dangaard Brouer
2023-11-28 13:06 ` Jesper Dangaard Brouer
2023-11-28 13:30 ` Daniel Borkmann
2023-11-28 14:39 ` Toke Høiland-Jørgensen
2023-11-29 18:04 ` Edward Cree
2023-11-29 21:52 ` Toke Høiland-Jørgensen
2023-11-29 23:10 ` Daniel Borkmann
2023-11-30 13:55 ` Toke Høiland-Jørgensen
2023-11-30 16:32 ` Daniel Borkmann
2023-11-30 20:35 ` Jesper Dangaard Brouer
2023-11-30 22:00 ` Martin KaFai Lau [this message]
2023-12-01 6:20 ` Yan Zhai
2023-12-01 17:09 ` Yan Zhai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e23e6332-7c09-42b8-83a0-1a8029132ccc@linux.dev \
--to=martin.lau@linux.dev \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=ecree.xilinx@gmail.com \
--cc=edumazet@google.com \
--cc=hawk@kernel.org \
--cc=jakub@cloudflare.com \
--cc=kernel-team@cloudflare.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@google.com \
--cc=toke@redhat.com \
--cc=yan@cloudflare.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).