Date: Sat, 30 Sep 2023 13:08:54 +0200
From: Florian Westphal
To: Yan Zhai
Cc: netdev@vger.kernel.org, "David S. Miller", David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Aya Levin,
	Tariq Toukan, linux-kernel@vger.kernel.org,
	kernel-team@cloudflare.com
Subject: Re: [PATCH net] ipv6: avoid atomic fragment on GSO packets
Message-ID: <20230930110854.GA13787@breakpoint.cc>
X-Mailing-List: netdev@vger.kernel.org

Yan Zhai wrote:
> GSO packets can contain a trailing segment that is smaller than
> gso_size. When examining the dst MTU for such packet, if its gso_size
> is too large, then all segments would be fragmented. However, there is a
> good chance the trailing segment has smaller actual size than both
> gso_size as well as the MTU, which leads to an "atomic fragment".
> RFC-8021 explicitly recommend to deprecate such use case. An Existing
> report from APNIC also shows that atomic fragments can be dropped
> unexpectedly along the path [1].
>
> Add an extra check in ip6_fragment to catch all possible generation of
> atomic fragments. Skip atomic header if it is called on a packet no
> larger than MTU.
>
> Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
> Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
> Reported-by: David Wragg
> Signed-off-by: Yan Zhai
> ---
>  net/ipv6/ip6_output.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 951ba8089b5b..42f5f68a6e24 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -854,6 +854,13 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
>  	__be32 frag_id;
>  	u8 *prevhdr, nexthdr = 0;
>
> +	/* RFC-8021 recommended atomic fragments to be deprecated. Double check
> +	 * the actual packet size before fragment it.
> +	 */
> +	mtu = ip6_skb_dst_mtu(skb);
> +	if (unlikely(skb->len <= mtu))
> +		return output(net, sk, skb);
> +

This helper is also called for skbs where IP6CB(skb)->frag_max_size
exceeds the MTU, so this check looks wrong to me.

Same remark for the dst_allfrag() check in __ip6_finish_output(): after
this patch, it would be ignored.

I think you should consider first refactoring __ip6_finish_output to make
the existing checks more readable (e.g. handle gso vs. non-gso in separate
branches) and then add the check to the last segment in
ip6_finish_output_gso_slowpath_drop().

Alternatively, you might be able to pass more info down to ip6_fragment
and move the decisions there.

In any case we should make the same frag-or-no-frag decisions,
regardless of this being the orig skb or a segmented one,