From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Yan Zhai <yan@cloudflare.com>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
"Willem de Bruijn" <willemb@google.com>,
"Simon Horman" <horms@kernel.org>,
"Florian Westphal" <fw@strlen.de>,
"Mina Almasry" <almasrymina@google.com>,
"Abhishek Chauhan" <quic_abchauha@quicinc.com>,
"David Howells" <dhowells@redhat.com>,
"Alexander Lobakin" <aleksander.lobakin@intel.com>,
"David Ahern" <dsahern@kernel.org>,
"Richard Gobert" <richardbgobert@gmail.com>,
"Antoine Tenart" <atenart@kernel.org>,
"Felix Fietkau" <nbd@nbd.name>,
"Soheil Hassas Yeganeh" <soheil@google.com>,
"Pavel Begunkov" <asml.silence@gmail.com>,
"Lorenzo Bianconi" <lorenzo@kernel.org>,
"Thomas Weißschuh" <linux@weissschuh.net>,
linux-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [RFC net-next 1/9] skb: introduce gro_disabled bit
Date: Sun, 30 Jun 2024 09:40:17 -0400
Message-ID: <668160415228c_c6202948c@willemb.c.googlers.com.notmuch>
In-Reply-To: <CAO3-PbrKRqeA4bCPnv7xkDiUFtuCMfzYZiEur3wM=+x8nc2xpQ@mail.gmail.com>
Yan Zhai wrote:
> On Sun, Jun 23, 2024 at 3:27 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Yan Zhai wrote:
> > > > > -static inline bool netif_elide_gro(const struct net_device *dev)
> > > > > +static inline bool netif_elide_gro(const struct sk_buff *skb)
> > > > > {
> > > > > - if (!(dev->features & NETIF_F_GRO) || dev->xdp_prog)
> > > > > + if (!(skb->dev->features & NETIF_F_GRO) || skb->dev->xdp_prog)
> > > > > return true;
> > > > > +
> > > > > +#ifdef CONFIG_SKB_GRO_CONTROL
> > > > > + return skb->gro_disabled;
> > > > > +#else
> > > > > return false;
> > > > > +#endif
> > > >
> > > > Yet more branches in the hot path.
> > > >
> > > > Compile time configurability does not help, as that will be
> > > > enabled by distros.
> > > >
> > > > For a fairly niche use case. Where functionality of GRO already
> > > > works. So just a performance for a very rare case at the cost of a
> > > > regression in the common case. A small regression perhaps, but death
> > > > by a thousand cuts.
> > > >
> > >
> > > I share your concern on operating on this hotpath. Will a
> > > static_branch + sysctl make it less aggressive?
> >
> > That is always a possibility. But we have to use it judiciously,
> > cannot add a sysctl for every branch.
> >
> > I'm still of the opinion that Paolo shared that this seems a lot of
> > complexity for a fairly minor performance optimization for a rare
> > case.
> >
> Actually combining the discussion in this thread, I think it would be
> more than the corner cases that we encounter. Let me elaborate below.
>
> > > Speaking of
> > > performance, I'd hope this can give us more control so we can achieve
> > > the best of two worlds: for TCP and some UDP traffic, we can enable
> > > GRO, while for some other classes that we know GRO does no good or
> > > even harm, let's disable GRO to save more cycles. The key observation
> > > is that developers may already know which traffic is blessed by GRO,
> > > but lack a way to realize it.
> >
> > Following up also on Daniel's point on using BPF as GRO engine. Even
> > earlier I tried to add an option to selectively enable GRO protocols
> > without BPF. Definitely worthwhile to be able to disable GRO handlers
> > to reduce attack surface to bad input.
> >
> I was probably staring too hard at my own things, which is indeed a
> corner case. But reducing the attack surface is indeed a good
> motivation for this patch. I checked briefly with our DoS team today,
> the DoS scenario will definitely benefit from skipping GRO, for
> example on SYN/RST floods. XDP is our main weapon to drop attack
> traffic today, but it does not always drop 100% of the floods, and
> time by time it does need to fall back to iptables due to the delay of
> XDP program assembly or the BPF limitation on analyzing the packet. I
> did an ad hoc measurement just now on a mostly idle server, with
> ~1.3Mpps SYN flood concentrated on one CPU and dropped them early in
> raw-PREROUTING. w/ GRO this would consume about 35-41% of the CPU
> time, while w/o GRO the time dropped to 9-12%. This seems a pretty
> significant breath room under heavy attacks.
A GRO opt-out might make sense.
A long time ago I sent a patch that configured GRO protocols using
syscalls, selectively (un)registering handlers. The interface was not
very nice, so I did not pursue it further. On the upside, the datapath
did not introduce any extra code. The intent was to reduce attack
surface of packet parsing code.
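
A minimal userspace sketch of the idea of selectively (un)registering
handlers, to show why the datapath needs no extra code: the receive
path does the same list lookup it always did, and a disabled protocol
is simply absent from the list. All names here (gro_handler,
handler_register, handler_unregister, handler_lookup) are hypothetical.
This is not the patch referred to above and not the kernel's
packet offload API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one per-protocol GRO handler. */
struct gro_handler {
	uint16_t proto;			/* EtherType, e.g. 0x0800 for IPv4 */
	const char *name;
	struct gro_handler *next;
};

/* List walked by the receive path. */
static struct gro_handler *handlers;

/* Control path: make a protocol eligible for GRO. */
static void handler_register(struct gro_handler *h)
{
	assert(h != NULL);
	h->next = handlers;
	handlers = h;
}

/* Control path: remove a protocol. Returns 0 on success, -1 if absent. */
static int handler_unregister(struct gro_handler *h)
{
	struct gro_handler **pp;

	for (pp = &handlers; *pp; pp = &(*pp)->next) {
		if (*pp == h) {
			*pp = h->next;
			h->next = NULL;
			return 0;
		}
	}
	return -1;
}

/*
 * Receive path: an ordinary lookup with no "is it disabled?" test.
 * A protocol that was unregistered is just not found, so the packet
 * skips aggregation and its parsing code is never reached.
 */
static const struct gro_handler *handler_lookup(uint16_t proto)
{
	const struct gro_handler *h;

	for (h = handlers; h; h = h->next)
		if (h->proto == proto)
			return h;
	return NULL;
}
```

The cost moves entirely to the control path. A real implementation
would also need safe concurrent updates of the list, which this
sketch leaves out.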
A few concerns with an XDP-based opt-out. It is more work to enable:
it requires compiling and loading an XDP program. It adds cycles in the
hot path. And I do not entirely understand when an XDP program will be
able to detect that a packet should not enter the GRO engine, but
cannot drop the packet (your netfilter example above).
> But I am not sure I understand "BPF as GRO engine" here, it seems to
> me that being able to disable GRO by XDP is already good enough. Any
> more motivations to do more complex work here?
FWIW, we looked into this a few years ago. Analogous to the BPF flow
dissector: if the BPF program is loaded, use that instead of the C
code path. But we did not arrive at a practical implementation at the
time. Things may have changed, but one issue is how to store and
access the list (or table) of outstanding GRO skbs.
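
A minimal userspace sketch of the dispatch pattern described above
(use the loaded program if there is one, otherwise the C code path).
Every name in it (pkt, gro_prog_fn, attached_prog, gro_receive) is
hypothetical, and a plain function pointer stands in for a BPF
program. It deliberately does not address the open issue of where
the outstanding GRO skbs would be stored.

```c
#include <stddef.h>

/* Hypothetical stand-in for a received packet. */
struct pkt {
	const unsigned char *data;
	size_t len;
};

/* Which path handled the packet; returned only for illustration. */
enum gro_path {
	BUILTIN_PATH,
	PROG_PATH,
};

typedef enum gro_path (*gro_prog_fn)(const struct pkt *p);

/* NULL when no program is loaded. */
static gro_prog_fn attached_prog;

/* Stand-in for the existing C implementation. */
static enum gro_path builtin_gro_receive(const struct pkt *p)
{
	(void)p;
	return BUILTIN_PATH;
}

/* Stand-in for a loaded program. */
static enum gro_path example_prog(const struct pkt *p)
{
	(void)p;
	return PROG_PATH;
}

/* Dispatch: the loaded program replaces the built-in path entirely. */
static enum gro_path gro_receive(const struct pkt *p)
{
	gro_prog_fn prog = attached_prog;	/* read once */

	if (prog)
		return prog(p);
	return builtin_gro_receive(p);
}
```

Setting attached_prog to example_prog routes every packet through the
program. Resetting it to NULL restores the built-in path.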
> best
> Yan
>
> >
> > >
> > > best
> > > Yan
> >
> >
Thread overview: 33+ messages
2024-06-20 22:19 [RFC net-next 0/9] xdp: allow disable GRO per packet by XDP Yan Zhai
2024-06-20 22:19 ` [RFC net-next 1/9] skb: introduce gro_disabled bit Yan Zhai
2024-06-21 9:11 ` Alexander Lobakin
2024-06-21 15:40 ` Yan Zhai
2024-06-21 9:49 ` Paolo Abeni
2024-06-21 14:29 ` Yan Zhai
2024-06-21 9:57 ` Paolo Abeni
2024-06-21 15:17 ` Yan Zhai
2024-06-21 12:15 ` Willem de Bruijn
2024-06-21 12:47 ` Daniel Borkmann
2024-06-21 16:00 ` Yan Zhai
2024-06-21 16:15 ` Daniel Borkmann
2024-06-21 17:20 ` Yan Zhai
2024-06-23 8:23 ` Willem de Bruijn
2024-06-24 13:30 ` Daniel Borkmann
2024-06-24 17:49 ` Yan Zhai
2024-06-21 15:34 ` Yan Zhai
2024-06-23 8:27 ` Willem de Bruijn
2024-06-24 18:17 ` Yan Zhai
2024-06-30 13:40 ` Willem de Bruijn [this message]
2024-07-03 18:46 ` Yan Zhai
2024-06-20 22:19 ` [RFC net-next 2/9] xdp: add XDP_FLAGS_GRO_DISABLED flag Yan Zhai
2024-06-21 9:15 ` Alexander Lobakin
2024-06-21 16:12 ` Yan Zhai
2024-06-20 22:19 ` [RFC net-next 3/9] xdp: implement bpf_xdp_disable_gro kfunc Yan Zhai
2024-06-20 22:19 ` [RFC net-next 4/9] bnxt: apply XDP offloading fixup when building skb Yan Zhai
2024-06-20 22:19 ` [RFC net-next 5/9] ice: " Yan Zhai
2024-06-21 9:20 ` Alexander Lobakin
2024-06-21 16:05 ` Yan Zhai
2024-06-20 22:19 ` [RFC net-next 6/9] veth: " Yan Zhai
2024-06-20 22:19 ` [RFC net-next 7/9] mlx5: move xdp_buff scope one level up Jesper Dangaard Brouer
2024-06-20 22:19 ` [RFC net-next 8/9] mlx5: apply XDP offloading fixup when building skb Yan Zhai
2024-06-20 22:19 ` [RFC net-next 9/9] bpf: selftests: test disabling GRO by XDP Yan Zhai