From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932103AbbHEUOc (ORCPT );
        Wed, 5 Aug 2015 16:14:32 -0400
Received: from terminus.zytor.com ([198.137.202.10]:50813 "EHLO
        terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753322AbbHEUOa (ORCPT );
        Wed, 5 Aug 2015 16:14:30 -0400
Date: Wed, 5 Aug 2015 13:13:50 -0700
From: tip-bot for Denys Vlasenko
Message-ID:
Cc: peterz@infradead.org, hpa@zytor.com, torvalds@linux-foundation.org,
        tglx@linutronix.de, linux-kernel@vger.kernel.org, mingo@kernel.org,
        tgraf@suug.ch, rientjes@google.com, dvlasenk@redhat.com,
        akpm@linux-foundation.org
Reply-To: rientjes@google.com, dvlasenk@redhat.com,
        akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
        mingo@kernel.org, tgraf@suug.ch, tglx@linutronix.de,
        torvalds@linux-foundation.org, hpa@zytor.com, peterz@infradead.org
In-Reply-To: <1438697716-28121-2-git-send-email-dvlasenk@redhat.com>
References: <1438697716-28121-2-git-send-email-dvlasenk@redhat.com>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:core/types] x86/hweight: Force inlining of __arch_hweight{32,64}()
Git-Commit-ID: d14edb1648221e59fc9fd47127fcc57bf26d759f
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  d14edb1648221e59fc9fd47127fcc57bf26d759f
Gitweb:     http://git.kernel.org/tip/d14edb1648221e59fc9fd47127fcc57bf26d759f
Author:     Denys Vlasenko
AuthorDate: Tue, 4 Aug 2015 16:15:15 +0200
Committer:  Ingo Molnar
CommitDate: Wed, 5 Aug 2015 09:38:09 +0200

x86/hweight: Force inlining of __arch_hweight{32,64}()

With this config:

  http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os

gcc-4.7.2 generates many copies of these tiny functions:

    __arch_hweight32 (35 copies):
    55                      push   %rbp
    e8 66 9b 4a 00          callq  __sw_hweight32
    48 89 e5                mov    %rsp,%rbp
    5d                      pop    %rbp
    c3                      retq

    __arch_hweight64 (8 copies):
    55                      push   %rbp
    e8 5e c2 8a 00          callq  __sw_hweight64
    48 89 e5                mov    %rsp,%rbp
    5d                      pop    %rbp
    c3                      retq

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122

This patch fixes this via s/inline/__always_inline/

To avoid touching 32-bit case where such change was not tested
to be a win, reformat __arch_hweight64() to have completely
disjoint 64-bit and 32-bit implementations. IOW: made #ifdef /
32 bits and 64 bits instead of having #ifdef / #else / #endif
inside a single function body. Only 64-bit __arch_hweight64()
is __always_inline'd.

        text     data      bss       dec  filename
    86971120 17195912 36659200 140826232  vmlinux.before
    86970954 17195912 36659200 140826066  vmlinux

Signed-off-by: Denys Vlasenko
Cc: Andrew Morton
Cc: David Rientjes
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Thomas Graf
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1438697716-28121-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar
---
 arch/x86/include/asm/arch_hweight.h | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index 9686c3d..259a7c1 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -21,7 +21,7 @@
  * ARCH_HWEIGHT_CFLAGS in for the respective
  * compiler switches.
  */
-static inline unsigned int __arch_hweight32(unsigned int w)
+static __always_inline unsigned int __arch_hweight32(unsigned int w)
 {
 	unsigned int res = 0;
 
@@ -42,20 +42,23 @@ static inline unsigned int __arch_hweight8(unsigned int w)
 	return __arch_hweight32(w & 0xff);
 }
 
+#ifdef CONFIG_X86_32
 static inline unsigned long __arch_hweight64(__u64 w)
 {
-	unsigned long res = 0;
-
-#ifdef CONFIG_X86_32
 	return  __arch_hweight32((u32)w) +
 		__arch_hweight32((u32)(w >> 32));
+}
 #else
+static __always_inline unsigned long __arch_hweight64(__u64 w)
+{
+	unsigned long res = 0;
+
 	asm (ALTERNATIVE("call __sw_hweight64", POPCNT64, X86_FEATURE_POPCNT)
 		     : "="REG_OUT (res)
 		     : REG_IN (w));
-#endif /* CONFIG_X86_32 */
 
 	return res;
 }
+#endif /* CONFIG_X86_32 */
 
 #endif
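
For readers who want to try the pattern outside the kernel tree, here is a minimal user-space sketch of the shape this patch gives the header: a forced-inline 32-bit wrapper, plus two disjoint definitions of the 64-bit helper selected by the preprocessor, with only the 64-bit variant forced inline. It is an illustration under stated assumptions, not the kernel code. The names `my_always_inline`, `sw_hweight32`, `wrap_hweight32` and `wrap_hweight64` are hypothetical, the `UINTPTR_MAX` check stands in for `CONFIG_X86_32`, and a portable bit-counting loop stands in for the `ALTERNATIVE()`/POPCNT asm.

```c
#include <stdint.h>

/* Assumption: GCC/Clang attribute syntax, playing the role of the kernel's
 * __always_inline. The name is hypothetical. */
#define my_always_inline inline __attribute__((__always_inline__))

/* Hypothetical stand-in for __sw_hweight32(): clears the lowest set bit on
 * each iteration and counts how many iterations that takes. */
static unsigned int sw_hweight32(uint32_t w)
{
	unsigned int res = 0;

	while (w) {
		w &= w - 1;
		res++;
	}
	return res;
}

/* Tiny wrapper: forced inline so the compiler cannot emit out-of-line
 * copies that only forward to the real function. */
static my_always_inline unsigned int wrap_hweight32(uint32_t w)
{
	return sw_hweight32(w);
}

#if UINTPTR_MAX == 0xffffffffu	/* stand-in for CONFIG_X86_32 */
/* 32-bit variant: left as plain inline, as in the patch. */
static inline unsigned long wrap_hweight64(uint64_t w)
{
	return wrap_hweight32((uint32_t)w) +
	       wrap_hweight32((uint32_t)(w >> 32));
}
#else
/* 64-bit variant: the only one that is forced inline. */
static my_always_inline unsigned long wrap_hweight64(uint64_t w)
{
	unsigned long res = 0;

	while (w) {
		w &= w - 1;
		res++;
	}
	return res;
}
#endif
```

Keeping the two variants as separate function definitions, rather than one body with an #ifdef inside it, is what allows the inlining attribute to differ per configuration.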