From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: Performance regressions in TCP_STREAM tests in Linux 4.15 (and later)
Date: Mon, 30 Apr 2018 09:36:00 -0700
Message-ID: <7e1f00ad-d859-0aab-c953-d6da2efe11f0@gmail.com>
References: <20180427231149.119db14c@vmware.local.home> <476bfc0f-eb2d-fe57-73d9-ec8a8392ad33@candelatech.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Cc: "netdev@vger.kernel.org", Shilpi Agarwal, Boon Ang, Darren Hart, Steven Rostedt, Abdul Anshad Azeez
To: Ben Greear, Steven Rostedt, Michael Wenig
Return-path:
Received: from mail-pf0-f196.google.com ([209.85.192.196]:39437 "EHLO mail-pf0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754557AbeD3QgE (ORCPT ); Mon, 30 Apr 2018 12:36:04 -0400
Received: by mail-pf0-f196.google.com with SMTP id z9so7145223pfe.6 for ; Mon, 30 Apr 2018 09:36:04 -0700 (PDT)
In-Reply-To: <476bfc0f-eb2d-fe57-73d9-ec8a8392ad33@candelatech.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 04/30/2018 09:14 AM, Ben Greear wrote:
> On 04/27/2018 08:11 PM, Steven Rostedt wrote:
>>
>> We'd like this email archived in netdev list, but since netdev is
>> notorious for blocking outlook email as spam, it didn't go through. So
>> I'm replying here to help get it into the archives.
>>
>> Thanks!
>>
>> -- Steve
>>
>>
>> On Fri, 27 Apr 2018 23:05:46 +0000
>> Michael Wenig wrote:
>>
>>> As part of VMware's performance testing with the Linux 4.15 kernel,
>>> we identified CPU cost and throughput regressions when comparing to
>>> the Linux 4.14 kernel. The impacted test cases are mostly TCP_STREAM
>>> send tests when using small message sizes. The regressions are
>>> significant (up to 3x) and were tracked down to be a side effect of Eric
>>> Dumazet's RB tree changes that went into the Linux 4.15 kernel.
>>> Further investigation showed our use of the TCP_NODELAY flag in
>>> conjunction with Eric's change caused the regressions to show and
>>> simply disabling TCP_NODELAY brought performance back to normal.
>>> Eric's change also resulted in significant improvements in our
>>> TCP_RR test cases.
>>>
>>>
>>>
>>> Based on these results, our theory is that Eric's change made the
>>> system overall faster (reduced latency) but as a side effect less
>>> aggregation is happening (with TCP_NODELAY) and that results in lower
>>> throughput. Previously even though TCP_NODELAY was set, system was
>>> slower and we still got some benefit of aggregation. Aggregation
>>> helps in better efficiency and higher throughput although it can
>>> increase the latency. If you are seeing a regression in your
>>> application throughput after this change, using TCP_NODELAY might
>>> help bring performance back however that might increase latency.
>
> I guess you mean _disabling_ TCP_NODELAY instead of _using_ TCP_NODELAY?
>

Yeah, I guess auto-corking does not work as intended.
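
For anyone who wants to try the two workarounds discussed above from
user space, here is a minimal sketch in Python. It is not taken from the
original report or from any test harness; the helper names set_nodelay()
and coalesce() and the 16 KB chunk size are assumptions made up for
illustration. The first helper clears or sets TCP_NODELAY on a socket,
the second aggregates small messages in the application so that
throughput depends less on what the stack coalesces.

```python
import socket


def set_nodelay(sock, enabled):
    """Set or clear TCP_NODELAY, then return what the kernel reports back.

    Hypothetical helper: clearing the flag re-enables Nagle-style
    aggregation of small writes, which is the workaround described above.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if enabled else 0)
    return bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))


def coalesce(messages, chunk_size=16384):
    """Yield the small messages merged into chunks of at least chunk_size bytes.

    Hypothetical helper: the last chunk may be shorter. The chunk size is
    an arbitrary example value, not a recommendation.
    """
    buf = bytearray()
    for msg in messages:
        buf += msg
        if len(buf) >= chunk_size:
            yield bytes(buf)
            buf.clear()
    if buf:
        yield bytes(buf)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print("TCP_NODELAY enabled:", set_nodelay(s, False))
    finally:
        s.close()
    sizes = [len(c) for c in coalesce([b"x" * 64] * 1000)]
    print("chunks:", len(sizes), "first chunk bytes:", sizes[0])
```

On a connected socket one would call set_nodelay(sock, False) once and
then sock.sendall(chunk) for each chunk from coalesce(). Whether that
recovers the 4.14 numbers on a given workload is something to measure,
not something this sketch shows.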