Date: Wed, 18 Oct 2017 16:57:11 +0800
From: Aaron Lu
To: Vlastimil Babka
Cc: "akpm@linux-foundation.org", "linux-kernel@vger.kernel.org",
 "tim.c.chen@linux.intel.com", "khandual@linux.vnet.ibm.com",
 "linux-mm@kvack.org", "ak@linux.intel.com", "Wang, Kemi",
 "Hansen, Dave", "Huang, Ying"
Subject: Re: [PATCH] mm/page_alloc: make sure __rmqueue() etc. always inline
Message-ID: <20171018085711.GC1753@intel.com>
References: <20171010025151.GD1798@intel.com>
 <20171010025601.GE1798@intel.com>
 <8d6a98d3-764e-fd41-59dc-88a9d21822c7@intel.com>
 <20171010054342.GF1798@intel.com>
 <20171010144545.c87a28b0f3c4e475305254ab@linux-foundation.org>
 <20171011023402.GC27907@intel.com>
 <20171013063111.GA26032@intel.com>
 <7304b3a4-d6cb-63fa-743d-ea8e7b126e32@suse.cz>
 <1508291629.14336.14.camel@intel.com>
 <29e5343f-b352-fe6a-02a8-74955cd606b8@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <29e5343f-b352-fe6a-02a8-74955cd606b8@suse.cz>
User-Agent: Mutt/1.9.1 (2017-09-22)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 18, 2017 at 08:28:56AM +0200, Vlastimil Babka wrote:
> On 10/18/2017 03:53 AM, Lu, Aaron wrote:
> > On Tue, 2017-10-17 at 13:32 +0200, Vlastimil Babka wrote:
> >> With gcc 7.2.1:
> >>> ./scripts/bloat-o-meter base.o mm/page_alloc.o
> >>
> >> add/remove: 1/2 grow/shrink: 2/0 up/down: 2493/-1649 (844)
> >
> > Nice, it clearly showed 844 bytes bloat.
> >
> >> function                         old     new   delta
> >> get_page_from_freelist          2898    4937   +2039
> >> steal_suitable_fallback            -     365    +365
> >> find_suitable_fallback            31     120     +89
> >> find_suitable_fallback.part      115       -    -115
> >> __rmqueue                       1534       -   -1534
>
> It also shows that steal_suitable_fallback() is no longer inlined. Which
> is fine, because that should ideally be rarely executed.

Ah right, so this script is really good for analysing inline changes.

>
> >>
> >>> [aaron@aaronlu obj]$ size */*/vmlinux
> >>>     text    data      bss      dec     hex filename
> >>> 10342757 5903208 17723392 33969357 20654cd gcc-4.9.4/base/vmlinux
> >>> 10342757 5903208 17723392 33969357 20654cd gcc-4.9.4/head/vmlinux
> >>> 10332448 5836608 17715200 33884256 2050860 gcc-5.5.0/base/vmlinux
> >>> 10332448 5836608 17715200 33884256 2050860 gcc-5.5.0/head/vmlinux
> >>> 10094546 5836696 17715200 33646442 201676a gcc-6.4.0/base/vmlinux
> >>> 10094546 5836696 17715200 33646442 201676a gcc-6.4.0/head/vmlinux
> >>> 10018775 5828732 17715200 33562707 2002053 gcc-7.2.0/base/vmlinux
> >>> 10018775 5828732 17715200 33562707 2002053 gcc-7.2.0/head/vmlinux
> >>>
> >>> Text size for vmlinux has no change though, probably due to function
> >>> alignment.
> >>
> >> Yep that's useless to show. These differences do add up though, until
> >> they eventually cross the alignment boundary.
> >
> > Agreed.
> > But you know, it is the hot path, the performance improvement might be
> > worth it.
>
> I'd agree, so you can add
>
> Acked-by: Vlastimil Babka

Thanks!
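
For reference, the bloat-o-meter summary line quoted in this thread can be cross-checked against its per-function rows. The snippet below is only a rough sketch of that arithmetic, not the actual scripts/bloat-o-meter code: the `summarize` helper and the `rows` tuples are hypothetical names made up for illustration, and the sizes are copied from the quoted table.

```python
# Per-function rows copied from the bloat-o-meter table quoted in this thread:
# (name, old size, new size). None marks a function absent from that build.
rows = [
    ("get_page_from_freelist", 2898, 4937),
    ("steal_suitable_fallback", None, 365),
    ("find_suitable_fallback", 31, 120),
    ("find_suitable_fallback.part", 115, None),
    ("__rmqueue", 1534, None),
]


def summarize(rows):
    """Recompute add/remove, grow/shrink and up/down totals (illustrative only)."""
    add = remove = grow = shrink = up = down = 0
    for _name, old, new in rows:
        delta = (new or 0) - (old or 0)
        if old is None:
            add += 1        # symbol only exists in the new object
        elif new is None:
            remove += 1     # symbol only exists in the old object
        elif delta > 0:
            grow += 1
        elif delta < 0:
            shrink += 1
        if delta > 0:
            up += delta
        else:
            down += delta
    return add, remove, grow, shrink, up, down


add, remove, grow, shrink, up, down = summarize(rows)
print(f"add/remove: {add}/{remove} grow/shrink: {grow}/{shrink} "
      f"up/down: {up}/{down} ({up + down})")
```

The net figure is the sum of all deltas. Here __rmqueue and find_suitable_fallback.part disappear as standalone symbols while get_page_from_freelist absorbs the inlined code, which is where the 844 bytes come from.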