Date: Wed, 3 Jul 2019 11:27:18 +0800
From: Tony Lu
To: Eric Dumazet
Cc: "David S. Miller", netdev, Eric Dumazet, Christoph Paasch,
	oliver.yang@linux.alibaba.com, xlpang@linux.alibaba.com,
	dust.li@linux.alibaba.com
Subject: Re: [PATCH net] tcp: refine memory limit test in tcp_fragment()
Message-ID: <20190703032718.GC55248@TonyMac-Alibaba>
Reply-To: Tony Lu
References: <20190621130955.147974-1-edumazet@google.com>
In-Reply-To: <20190621130955.147974-1-edumazet@google.com>
User-Agent: Mutt/1.12.1 (2019-06-15)
X-Mailing-List: netdev@vger.kernel.org

Hello Eric,

We applied commit e358f4af19db ("tcp: tcp_fragment() should apply sane
memory limits") as a hotpatch in our production environment. With that
commit applied, long-lived TCP connections were reset while sending
packets. The applications in our A/B test were affected: they had to
retransmit large amounts of data, which caused a retransmission storm,
lowered performance, and increased RT. We therefore stopped applying
this hotpatch in the A/B test.

After investigation, we found that this patch already fixes the issue in
stable. Before applying this patch, we have some questions:

1. The commit in stable hard-codes a magic number, 0x20000. I am
   wondering how this value was chosen and whether there is a better
   solution.

2. Are there any known or suspected side effects? If so, we could test
   the suspicious scenarios before testing in the production
   environment.

Thanks.

Cheers,
Tony Lu

On Fri, Jun 21, 2019 at 06:09:55AM -0700, Eric Dumazet wrote:
> tcp_fragment() might be called for skbs in the write queue.
>
> Memory limits might have been exceeded because tcp_sendmsg() only
> checks limits at full skb (64KB) boundaries.
>
> Therefore, we need to make sure tcp_fragment() wont punish applications
> that might have setup very low SO_SNDBUF values.
>
> Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
> Signed-off-by: Eric Dumazet
> Reported-by: Christoph Paasch
> ---
>  net/ipv4/tcp_output.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 00c01a01b547ec67c971dc25a74c9258563cf871..0ebc33d1c9e5099d163a234930e213ee35e9fbd1 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1296,7 +1296,8 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
>  	if (nsize < 0)
>  		nsize = 0;
>
> -	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
> +	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
> +		     tcp_queue != TCP_FRAG_IN_WRITE_QUEUE)) {
>  		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
>  		return -ENOMEM;
>  	}
> --
> 2.22.0.410.gd8fdbe21b5-goog
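
To make the effect of the quoted hunk easier to reason about, here is a
minimal userspace sketch of the condition before and after the change.
It is only a model built from the diff above, not kernel code: struct
sock_model and the two helper functions are hypothetical names, only the
two fields used by the test are modelled, and TCP_FRAG_IN_RTX_QUEUE is
assumed to be the other value of enum tcp_queue.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel enum; assumed to have two values. */
enum tcp_queue {
	TCP_FRAG_IN_WRITE_QUEUE,
	TCP_FRAG_IN_RTX_QUEUE,
};

/* Hypothetical model of the two struct sock fields the test reads. */
struct sock_model {
	int sk_wmem_queued;	/* bytes currently charged to the send queue */
	int sk_sndbuf;		/* send buffer limit (SO_SNDBUF) */
};

/* Condition before the patch: reject whenever half of the queued memory
 * exceeds the send buffer, regardless of which queue the skb is in. */
static bool frag_rejected_before(const struct sock_model *sk)
{
	return (sk->sk_wmem_queued >> 1) > sk->sk_sndbuf;
}

/* Condition after the patch: same memory test, but skbs still in the
 * write queue are never rejected. */
static bool frag_rejected_after(const struct sock_model *sk,
				enum tcp_queue tcp_queue)
{
	return (sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
	       tcp_queue != TCP_FRAG_IN_WRITE_QUEUE;
}
```

For example, with sk_wmem_queued = 65536 and sk_sndbuf = 4608 (one full
64KB skb queued on a socket with a very low send buffer), the old
condition rejects the split. The new condition lets it through for
TCP_FRAG_IN_WRITE_QUEUE and still rejects it for the other queue value.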