From mboxrd@z Thu Jan 1 00:00:00 1970 From: subashab@codeaurora.org Subject: Re: [net PATCH v2 2/2] ipv4/GRO: Make GRO conform to RFC 6864 Date: Mon, 04 Apr 2016 18:38:37 -0600 Message-ID: <98efba2f85ac01ba988c62398d31bb77@codeaurora.org> References: <20160404162545.14332.653.stgit@localhost.localdomain> <20160404162818.14332.1076.stgit@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Cc: herbert@gondor.apana.org.au, tom@herbertland.com, jesse@kernel.org, alexander.duyck@gmail.com, edumazet@google.com, netdev@vger.kernel.org, davem@davemloft.net, netdev-owner@vger.kernel.org To: Alexander Duyck Return-path: Received: from smtp.codeaurora.org ([198.145.29.96]:47951 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752880AbcDEAim (ORCPT ); Mon, 4 Apr 2016 20:38:42 -0400 In-Reply-To: <20160404162818.14332.1076.stgit@localhost.localdomain> Sender: netdev-owner@vger.kernel.org List-ID: On 2016-04-04 10:31, Alexander Duyck wrote: > RFC 6864 states that the IPv4 ID field MUST NOT be used for purposes > other > than fragmentation and reassembly. Currently we are looking at this > field > as a way of identifying what frames can be aggregated and which cannot > for > GRO. While this is valid for frames that do not have DF set, it is > invalid > to do so if the bit is set. > > In addition we were generating IPv4 ID collisions when 2 or more flows > were > interleaved over the same tunnel. To prevent that we store the result > of > all IP ID checks via a "|=" instead of overwriting previous values. > > With this patch we support two different approaches for the IP ID > field. > The first is a non-incrementing IP ID with DF bit set. In such a case > we > simply won't write to the flush_id field in the GRO context block. The > other option is the legacy option in which the IP ID must increment by > 1 > for every packet we aggregate. > > In the case of the non-incrementing IP ID we will end up losing the > data > that the IP ID is fixed. However as per RFC 6864 we should be able to > write any value into the IP ID when the DF bit is set so this should > cause > minimal harm. > > Signed-off-by: Alexander Duyck > --- > > v2: Updated patch so that we now only support one of two options. > Either > the IP ID is fixed with DF bit set, or the IP ID is incrementing. > That > allows us to support the fixed ID case as occurs with IPv6 to IPv4 > header translation and what is likely already out there for some > devices with tunnel headers. > > net/core/dev.c | 1 + > net/ipv4/af_inet.c | 25 ++++++++++++++++++------- > net/ipv6/ip6_offload.c | 3 --- > 3 files changed, 19 insertions(+), 10 deletions(-) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 77a71cd68535..3429632398a4 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -4352,6 +4352,7 @@ static void gro_list_prepare(struct napi_struct > *napi, struct sk_buff *skb) > unsigned long diffs; > > NAPI_GRO_CB(p)->flush = 0; > + NAPI_GRO_CB(p)->flush_id = 0; > > if (hash != skb_get_hash_raw(p)) { > NAPI_GRO_CB(p)->same_flow = 0; > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c > index 9e481992dbae..33f6335448a2 100644 > --- a/net/ipv4/af_inet.c > +++ b/net/ipv4/af_inet.c > @@ -1324,6 +1324,7 @@ static struct sk_buff **inet_gro_receive(struct > sk_buff **head, > > for (p = *head; p; p = p->next) { > struct iphdr *iph2; > + u16 flush_id; > > if (!NAPI_GRO_CB(p)->same_flow) > continue; > @@ -1347,14 +1348,24 @@ static struct sk_buff > **inet_gro_receive(struct sk_buff **head, > (iph->tos ^ iph2->tos) | > ((iph->frag_off ^ iph2->frag_off) & htons(IP_DF)); > > - /* Save the IP ID check to be included later when we get to > - * the transport layer so only the inner most IP ID is checked. > - * This is because some GSO/TSO implementations do not > - * correctly increment the IP ID for the outer hdrs. > - */ > - NAPI_GRO_CB(p)->flush_id = > - ((u16)(ntohs(iph2->id) + NAPI_GRO_CB(p)->count) ^ id); > NAPI_GRO_CB(p)->flush |= flush; > + > + /* We must save the offset as it is possible to have multiple > + * flows using the same protocol and address pairs so we > + * need to wait until we can validate this is part of the > + * same flow with a 5-tuple or better to avoid unnecessary > + * collisions between flows. We can support one of two > + * possible scenarios, either a fixed value with DF bit set > + * or an incrementing value with DF either set or unset. > + * In the case of a fixed value we will end up losing the > + * data that the IP ID was a fixed value, however per RFC > + * 6864 in such a case the actual value of the IP ID is > + * meant to be ignored anyway. > + */ > + flush_id = (u16)(id - ntohs(iph2->id)); > + if (flush_id || !(iph2->frag_off & htons(IP_DF))) > + NAPI_GRO_CB(p)->flush_id |= flush_id ^ > + NAPI_GRO_CB(p)->count; > } > > NAPI_GRO_CB(skb)->flush |= flush; > diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c > index 82e9f3076028..9aa53f64dffd 100644 > --- a/net/ipv6/ip6_offload.c > +++ b/net/ipv6/ip6_offload.c > @@ -238,9 +238,6 @@ static struct sk_buff **ipv6_gro_receive(struct > sk_buff **head, > /* flush if Traffic Class fields are different */ > NAPI_GRO_CB(p)->flush |= !!(first_word & htonl(0x0FF00000)); > NAPI_GRO_CB(p)->flush |= flush; > - > - /* Clear flush_id, there's really no concept of ID in IPv6. */ > - NAPI_GRO_CB(p)->flush_id = 0; > } > > NAPI_GRO_CB(skb)->flush |= flush; I can see coalescing of packets which have IP ID=0 and DF=1 originating from a IPv6 to IPv4 translator. Tested-by: Subash Abhinov Kasiviswanathan