From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ben Greear
Subject: Re: Performance regression between 4.13 and 4.14
Date: Wed, 9 May 2018 13:55:49 -0700
Message-ID: <3db3e973-67be-781e-5550-33eb82b7ed1e@candelatech.com>
References: <9115910b-dd8b-e7f3-be53-f739b8382032@candelatech.com> <17a364e0-89c8-d4f6-3873-353c7dae4fba@candelatech.com> <988a00db-bab3-7318-5e31-2a7c0948262a@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
To: Eric Dumazet, netdev
Return-path:
Received: from mail2.candelatech.com ([208.74.158.173]:39172 "EHLO mail2.candelatech.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935598AbeEIUzw (ORCPT ); Wed, 9 May 2018 16:55:52 -0400
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On 05/09/2018 12:02 PM, Ben Greear wrote:
> On 05/09/2018 11:48 AM, Eric Dumazet wrote:
>>
>>
>> On 05/09/2018 11:43 AM, Ben Greear wrote:
>>> On 05/08/2018 10:10 AM, Eric Dumazet wrote:
>>>>
>>>>
>>>> On 05/08/2018 09:44 AM, Ben Greear wrote:
>>>>> Hello,
>>>>>
>>>>> I am trying to track down a performance regression that appears to be between 4.13
>>>>> and 4.14.
>>>>>
>>>>> I first saw the problem with a hacked version of pktgen on some ixgbe NICs. 4.13 can do
>>>>> right at 10G bi-directional on two ports, and 4.14 and later can do only about 6Gbps.
>>>>>
>>>>> I also tried with user-space UDP traffic on a stock kernel, and I can get about 3.2Gbps combined tx+rx
>>>>> on 4.14 and about 4.4Gbps on 4.13.
>>>>>
>>>>> Attempting to bisect seems to be triggering a weirdness in git, and also lots of commits
>>>>> crash or do not bring up networking, which makes the bisect difficult.
>>>>>
>>>>> Looking at perf top, it would appear that some lock is probably to blame.
>>>>
>>>>
>>>> perf record -a -g -e cycles:pp sleep 5
>>>> perf report
>>>>
>>>> Then you'll be able to tell us which lock (or call graph) is killing your perf.
>>>>
>>>
>>> I seem to be chasing multiple issues.
>>> For 4.13, at least part of my problem was that LOCKDEP was enabled
>>> during my bisect, though it does NOT appear enabled in 4.16. I think maybe CONFIG_LOCKDEP moved to CONFIG_PROVE_LOCKING
>>> in 4.16, or something like that? My 4.16 .config does have CONFIG_LOCKDEP_SUPPORT enabled, and I see no option to disable it:
>>>
>>> [greearb@ben-dt3 linux-4.16.x64]$ grep LOCKDEP .config
>>> CONFIG_LOCKDEP_SUPPORT=y
>>>
>>>
>>> For 4.16, I am disabling RETPOLINE...are there any other such things I need
>>> to disable to keep from getting a performance hit from the spectre-related bug
>>> fixes? At this point, I do not care about the security implications.
>>>
>>> [greearb@ben-dt3 linux-4.16.x64]$ grep RETPO .config
>>> # CONFIG_RETPOLINE is not set
>>>
>>>
>>> Thanks,
>>> Ben
>>>
>>
>> No idea really, you mention a 4.13 -> 4.14 regression and jump then to 4.16 :/
>
> I initially saw the problem in 4.16, then bisected, and 4.14 still showed the
> issue.

So, I guess I must have been enabling lockdep the whole time. This __lock_acquire
is from lockdep as far as I can tell, not normal locking.

I re-built 4.16 after verifying as best as I could that lockdep was not enabled,
and now it performs as expected.

I'm going to test a patch to change __lock_acquire to __lock_acquire_lockdep so
maybe someone else will not make the same mistake I made.

> + 17.78% 17.78% kpktgend_1 [kernel.kallsyms] [k] __lock_acquire.isra.3

Thanks,
Ben

-- 
Ben Greear
Candela Technologies Inc http://www.candelatech.com
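
P.S. For anyone repeating this kind of check, here is a minimal sketch of how a .config could be tested for lockdep before benchmarking. The function name check_lockdep is hypothetical, not something from the kernel tree. It assumes that the runtime checker shows up as CONFIG_LOCKDEP=y and is normally pulled in by CONFIG_PROVE_LOCKING, CONFIG_DEBUG_LOCK_ALLOC or CONFIG_LOCK_STAT, while CONFIG_LOCKDEP_SUPPORT=y on its own only says the architecture is able to support lockdep. Please verify the option list against lib/Kconfig.debug for the kernel version in question.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel .config enables lockdep.
# Assumption: lockdep is active when CONFIG_LOCKDEP=y, which is normally
# selected by PROVE_LOCKING, DEBUG_LOCK_ALLOC or LOCK_STAT.
# CONFIG_LOCKDEP_SUPPORT=y is deliberately NOT matched by the pattern.
check_lockdep() {
    config="${1:-.config}"
    # Print any matching options so the culprit is visible.
    if grep -E '^CONFIG_(LOCKDEP|PROVE_LOCKING|DEBUG_LOCK_ALLOC|LOCK_STAT)=y' "$config"; then
        echo "lockdep appears ENABLED in $config"
        return 1
    fi
    echo "lockdep appears disabled in $config"
    return 0
}
```

Usage would be something like "check_lockdep .config" from the top of the build tree; a non-zero return means one of the listed options is set to y.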