From: Alexander Lobakin <alexandr.lobakin@intel.com>
To: Richard Gobert <richardbgobert@gmail.com>
Cc: <davem@davemloft.net>, <edumazet@google.com>, <kuba@kernel.org>,
<pabeni@redhat.com>, <yoshfuji@linux-ipv6.org>,
<dsahern@kernel.org>, <steffen.klassert@secunet.com>,
<lixiaoyan@google.com>, <alexanderduyck@fb.com>,
<leon@kernel.org>, <ye.xingchen@zte.com.cn>,
<iwienand@redhat.com>, <netdev@vger.kernel.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] gro: optimise redundant parsing of packets
Date: Mon, 30 Jan 2023 16:40:48 +0100 [thread overview]
Message-ID: <836bc0f5-004b-3b7d-d0d7-b70b030977c6@intel.com> (raw)
In-Reply-To: <20230130130752.GA8015@debian>
From: Richard Gobert <richardbgobert@gmail.com>
Date: Mon, 30 Jan 2023 14:07:55 +0100
> Currently, the IPv6 extension headers are parsed twice: first in
> ipv6_gro_receive, and then again in ipv6_gro_complete.
>
> The field NAPI_GRO_CB(skb)->proto is used by GRO to hold the layer 4
> protocol type that comes after the IPv6 layer. I noticed that it is set
> in ipv6_gro_receive, but isn't used anywhere. By using this field, and
> also storing the size of the network header, we can avoid parsing
> extension headers a second time in ipv6_gro_complete.
>
> The implementation had to handle both inner and outer layers in case of
> encapsulation (as they can't use the same field).
>
> I've applied this optimisation to all base protocols (IPv6, IPv4,
> Ethernet). Then, I benchmarked this patch on my machine, using ftrace to
> measure ipv6_gro_complete's performance, and there was an improvement.
Would be nice to see some perf numbers. "there was an improvement"
doesn't say a lot TBH...
>
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
> ---
> include/net/gro.h | 8 ++++++--
> net/ethernet/eth.c | 11 +++++++++--
> net/ipv4/af_inet.c | 8 +++++++-
> net/ipv6/ip6_offload.c | 15 ++++++++++++---
> 4 files changed, 34 insertions(+), 8 deletions(-)
[...]
> @@ -456,12 +459,16 @@ EXPORT_SYMBOL(eth_gro_receive);
> int eth_gro_complete(struct sk_buff *skb, int nhoff)
> {
> struct ethhdr *eh = (struct ethhdr *)(skb->data + nhoff);
> - __be16 type = eh->h_proto;
> + __be16 type;
Please don't break RCT style when shortening/expanding variable
declaration lines.
> struct packet_offload *ptype;
> int err = -ENOSYS;
>
> - if (skb->encapsulation)
> + if (skb->encapsulation) {
> skb_set_inner_mac_header(skb, nhoff);
> + type = eh->h_proto;
> + } else {
> + type = NAPI_GRO_CB(skb)->network_proto;
> + }
>
> ptype = gro_find_complete_by_type(type);
> if (ptype != NULL)
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 6c0ec2789943..4401af7b3a15 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1551,6 +1551,9 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> * immediately following this IP hdr.
> */
>
> + if (!NAPI_GRO_CB(skb)->encap_mark)
> + NAPI_GRO_CB(skb)->transport_proto = proto;
> +
> /* Note : No need to call skb_gro_postpull_rcsum() here,
> * as we already checked checksum over ipv4 header was 0
> */
> @@ -1621,12 +1624,15 @@ int inet_gro_complete(struct sk_buff *skb, int nhoff)
> __be16 newlen = htons(skb->len - nhoff);
> struct iphdr *iph = (struct iphdr *)(skb->data + nhoff);
> const struct net_offload *ops;
> - int proto = iph->protocol;
> + int proto;
(same)
> int err = -ENOSYS;
>
> if (skb->encapsulation) {
> skb_set_inner_protocol(skb, cpu_to_be16(ETH_P_IP));
> skb_set_inner_network_header(skb, nhoff);
> + proto = iph->protocol;
> + } else {
> + proto = NAPI_GRO_CB(skb)->transport_proto;
> }
>
> csum_replace2(&iph->check, iph->tot_len, newlen);
[...]
> @@ -358,7 +361,13 @@ INDIRECT_CALLABLE_SCOPE int ipv6_gro_complete(struct sk_buff *skb, int nhoff)
> iph->payload_len = htons(payload_len);
> }
>
> - nhoff += sizeof(*iph) + ipv6_exthdrs_len(iph, &ops);
> + if (!skb->encapsulation) {
> + ops = rcu_dereference(inet6_offloads[NAPI_GRO_CB(skb)->transport_proto]);
> + nhoff += NAPI_GRO_CB(skb)->network_len;
Why not use the same skb_network_header_len() here? Both
skb->network_header and skb->transport_header must be set and correct at
this point (if not, you can always fix that).
> + } else {
> + nhoff += sizeof(*iph) + ipv6_exthdrs_len(iph, &ops);
> + }
> +
> if (WARN_ON(!ops || !ops->callbacks.gro_complete))
> goto out;
>
Thanks,
Olek
next prev parent reply other threads:[~2023-01-30 15:41 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-30 13:00 [PATCH 0/2] gro: optimise redundant parsing of packets Richard Gobert
2023-01-30 13:05 ` [PATCH 1/2] gro: decrease size of CB Richard Gobert
2023-01-30 13:07 ` [PATCH 2/2] gro: optimise redundant parsing of packets Richard Gobert
2023-01-30 15:40 ` Alexander Lobakin [this message]
2023-02-22 14:47 ` Richard Gobert
2023-01-30 17:39 ` Eric Dumazet
2023-02-22 14:35 ` Richard Gobert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=836bc0f5-004b-3b7d-d0d7-b70b030977c6@intel.com \
--to=alexandr.lobakin@intel.com \
--cc=alexanderduyck@fb.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=iwienand@redhat.com \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lixiaoyan@google.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=richardbgobert@gmail.com \
--cc=steffen.klassert@secunet.com \
--cc=ye.xingchen@zte.com.cn \
--cc=yoshfuji@linux-ipv6.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).