All of lore.kernel.org
 help / color / mirror / Atom feed
From: Edward Cree <ecree@solarflare.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>,
	<linux-net-drivers@solarflare.com>,
	Tom Herbert <tom@herbertland.com>
Subject: Re: [PATCH v2 net-next 1/5] net: local checksum offload for encapsulation
Date: Fri, 8 Jan 2016 15:32:25 +0000	[thread overview]
Message-ID: <568FD689.3070300@solarflare.com> (raw)
In-Reply-To: <CAKgT0UfjGkjiXuV8EqdtwrunAS0MDSb2rCzkt+6F+yipGzK9sw@mail.gmail.com>

On 07/01/16 22:53, Alexander Duyck wrote:
> On Thu, Jan 7, 2016 at 9:12 AM, Edward Cree <ecree@solarflare.com> wrote:
>> The arithmetic properties of the ones-complement checksum mean that a
>>  correctly checksummed inner packet, including its checksum, has a ones
>>  complement sum depending only on whatever value was used to initialise
>>  the checksum field before checksumming (in the case of TCP and UDP,
>>  this is the ones complement sum of the pseudo header, complemented).
>> Consequently, if we are going to offload the inner checksum with
>>  CHECKSUM_PARTIAL, we can compute the outer checksum based only on the
>>  packed data not covered by the inner checksum, and the initial value of
>>  the inner checksum field.
>>
>> Signed-off-by: Edward Cree <ecree@solarflare.com>
>> ---
>>  include/linux/skbuff.h    | 26 ++++++++++++++++++++++++++
>>  net/ipv4/ip_tunnel_core.c |  4 ++++
>>  net/ipv4/udp.c            | 29 ++++++++++-------------------
>>  net/ipv6/ip6_checksum.c   | 24 ++++++++----------------
>>  4 files changed, 48 insertions(+), 35 deletions(-)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 6b6bd42..6590d08 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -3665,5 +3665,31 @@ static inline unsigned int skb_gso_network_seglen(const struct sk_buff *skb)
>>         return hdr_len + skb_gso_transport_seglen(skb);
>>  }
>>
>> +/* Local Checksum Offload.
>> + * Compute outer checksum based on the assumption that the
>> + * inner checksum will be offloaded later.
>> + * See Documentation/networking/tx-offloads.txt for
>> + * explanation of how this works.
>> + * Fill in outer checksum adjustment (e.g. with sum of outer
>> + * pseudo-header) before calling.
>> + * Also ensure that inner checksum is in linear data area.
>> + */
>> +static inline __wsum lco_csum(struct sk_buff *skb)
>> +{
>> +       char *inner_csum_field;
>> +       __wsum csum;
>> +
>> +       /* Start with complement of inner checksum adjustment */
>> +       inner_csum_field = skb->data + skb_checksum_start_offset(skb) +
>> +                               skb->csum_offset;
> You would probably benefit from caching off the result of
> skb_checksum_start_offset into a local variable so the compiler
> doesn't go through and recompute it when you call it again below.
It's a nearly-trivial inline function; won't the compiler be smart enough to
 cache that result itself?
>> +       csum = ~csum_unfold(*(__force __sum16 *)inner_csum_field);
> This seems like a lot of work, couldn't you get away with just
> bit-flipping this and moving it into uh->check on the outer header?
It's not a lot of work: all this does is zero-extend to 32 bits and flip.
It looks like more, but most of it is just a cast; it's written in this way
 to pacify sparse while using as little __force as possible.
lco_csum can't move it into uh->check, because it doesn't have uh.  In fact,
 the skb might not even be UDP - this function is intended to be used also
 for GRE, which has an IP-style checksum but in a different place.  (Next
 version of the patch series will implement that btw)
>> +       /* Add in checksum of our headers (incl. outer checksum
>> +        * adjustment filled in by caller)
>> +        */
>> +       csum = skb_checksum(skb, 0, skb_checksum_start_offset(skb), csum);
>> +       /* The result is the outer checksum */
>> +       return csum;
>> +}
>> +
> The more I think about it I am not sure how much there is to be gained
> by having this as a separate function anyway since I think you might
> be able to better exploit things with a few changes to the ordering of
> operations.  See my notes below in the IPv4 section.
>
>>  #endif /* __KERNEL__ */
>>  #endif /* _LINUX_SKBUFF_H */
>> diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
>> index 1db8418..f39b064 100644
>> --- a/net/ipv4/ip_tunnel_core.c
>> +++ b/net/ipv4/ip_tunnel_core.c
>> @@ -146,6 +146,10 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
>>  }
>>  EXPORT_SYMBOL_GPL(iptunnel_metadata_reply);
>>
>> +/* csum_help should only be ever true if the protocol doesn't support LCO.
>> + * If the tunnel uses udp_tunnel_xmit_skb(), then it gets LCO for free, and
>> + * should always set csum_help to false.
>> + */
>>  struct sk_buff *iptunnel_handle_offloads(struct sk_buff *skb,
>>                                          bool csum_help,
>>                                          int gso_type_mask)
>> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
>> index 8841e98..c1c73be 100644
>> --- a/net/ipv4/udp.c
>> +++ b/net/ipv4/udp.c
>> @@ -767,32 +767,23 @@ void udp_set_csum(bool nocheck, struct sk_buff *skb,
>>  {
>>         struct udphdr *uh = udp_hdr(skb);
>>
>> -       if (nocheck)
>> +       if (nocheck) {
>>                 uh->check = 0;
>> -       else if (skb_is_gso(skb))
>> +       } else if (skb_is_gso(skb)) {
>>                 uh->check = ~udp_v4_check(len, saddr, daddr, 0);
>> -       else if (skb_dst(skb) && skb_dst(skb)->dev &&
>> -                (skb_dst(skb)->dev->features &
>> -                 (NETIF_F_IP_CSUM | NETIF_F_HW_CSUM))) {
>> -
>> -               BUG_ON(skb->ip_summed == CHECKSUM_PARTIAL);
>> +       } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
>> +               __wsum csum;
>>
> I wonder if this shouldn't be made a check that is in addition to the
> two options below instead of completely replacing them.  The question
> I would have is if there are any cases where we need to follow the
> path that results in the CHECKSUM_UNNECESSARY being set.
I don't think there can be such a case.
Either: we've already set up PARTIAL for an inner header, so we can
 definitely do LCO.
Or: we haven't set up PARTIAL yet, so we can use that now.  If the device
 doesn't support it, it'll get fixed up later when we validate_xmit_skb().
So there's no way (AFAICT) that we'd ever not be able to use PARTIAL.
Unless - hmmm - what happens if we've set up PARTIAL for a CRC rather than
 an IP checksum?  However, it looks to me as if in that case the old code
 would have screwed up when iptunnel_handle_offloads() would do the inner
 csum in skb_checksum_help() and would do it as an IP checksum.  So I'm
 guessing this probably can't happen.  Or it's already broken and so my
 patch won't make it any worse ;)

However, the next version of the patch series will split this change out
 from the rest of the patch, as Tom Herbert suggested.
>
>> +               uh->check = ~udp_v4_check(len, saddr, daddr, 0);
>> +               csum = lco_csum(skb);
>> +               uh->check = csum_fold(csum);
>> +               if (uh->check == 0)
>> +                       uh->check = CSUM_MANGLED_0;
>
> You would probably benefit from reordering this to something like what
> we have in the last block below this one.  The idea is then you only
> halve to fold things once and can avoid some unnecessary duplication.
>
> So you could code it up with something like:
>   __wsum csum;
>   int start_offset;
>
>   start_offset = skb_checksum_start_offset(skb);
>   uh->check = ~(*(__sum16 *)(skb->data + start_offset + skb->csum_offset));
>   csum = skb_checksum(skb, 0, start_offset, 0);
>   uh->check = udp_v4_check(len, saddr, daddr, csum);
>   if (uh->check == 0)
>     uh->check = CSUM_MANGLED_0;
>
> Forgive the formatting, my email client mangles tabs badly.  By using
> the pseudo header checksum from the inner header for the starting
> value and then computing the udp_v4_check for the outer header last
> you save yourself a few cycles since you only have to fold the
> checksum once instead of once for the pseudo-header and again for the
> final result.
Hmm, I think we can do this without losing the helper function (which will
 be shared not just by UDPv4 and UDPv6 but also GRE).
Something like this:
  uh->check = 0;
  uh->check = ~udp_v4_check(len, saddr, daddr, lco_csum(skb));
  if (uh->check == 0)
    uh->check = CSUM_MANGLED_0;
Now the only fold is the one udp_v4_check() does.
Would that shave off enough cycles to satisfy?

  reply	other threads:[~2016-01-08 15:32 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-07 17:10 [PATCH v2 net-next 0/5] Local Checksum Offload Edward Cree
2016-01-07 17:12 ` [PATCH v2 net-next 1/5] net: local checksum offload for encapsulation Edward Cree
2016-01-07 17:22   ` David Laight
2016-01-07 17:54     ` Edward Cree
2016-01-07 18:42   ` Tom Herbert
2016-01-07 22:53   ` Alexander Duyck
2016-01-08 15:32     ` Edward Cree [this message]
2016-01-08 17:30       ` Alexander Duyck
2016-01-07 17:12 ` [PATCH v2 net-next 2/5] net: enable LCO for udp_tunnel_handle_offloads() users Edward Cree
2016-01-07 17:12 ` [PATCH v2 net-next 3/5] net: vxlan: enable local checksum offload Edward Cree
2016-01-08  0:15   ` Alexander Duyck
2016-01-08 15:33     ` Edward Cree
2016-01-08  3:46   ` Alexander Duyck
2016-01-08 15:39     ` Edward Cree
2016-01-08 18:03       ` Alexander Duyck
2016-01-08 19:40         ` Jesse Gross
2016-01-08 21:22           ` Alexander Duyck
2016-01-08 21:36             ` Rick Jones
2016-01-08 22:07               ` Tom Herbert
2016-01-11 17:24             ` Jesse Gross
2016-01-11 17:55               ` Tom Herbert
2016-01-11 18:27                 ` Edward Cree
2016-01-11 18:43                   ` Tom Herbert
2016-01-07 17:13 ` [PATCH v2 net-next 4/5] fou: enable LCO in FOU and GUE Edward Cree
2016-01-07 18:51   ` Tom Herbert
2016-01-07 19:00     ` Edward Cree
2016-01-07 17:14 ` [PATCH v2 net-next 5/5] Documentation/networking: add tx-offloads.txt to explain LCO Edward Cree
2016-01-07 18:58   ` Tom Herbert
2016-01-11 17:05 ` [RFC PATCH 0/2] Rework of "net: local checksum offload for encapsulation" Alexander Duyck
2016-01-11 17:06   ` [RFC PATCH 1/2] net: local checksum offload for encapsulation Alexander Duyck
2016-01-11 17:06   ` [RFC PATCH 2/2] net: Add support for UDP local checksum offload as a part of tunnel segmentation Alexander Duyck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=568FD689.3070300@solarflare.com \
    --to=ecree@solarflare.com \
    --cc=alexander.duyck@gmail.com \
    --cc=davem@davemloft.net \
    --cc=linux-net-drivers@solarflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=tom@herbertland.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.