From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751325AbcEJRXi (ORCPT ); Tue, 10 May 2016 13:23:38 -0400
Received: from merlin.infradead.org ([205.233.59.134]:58698 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750864AbcEJRXh (ORCPT ); Tue, 10 May 2016 13:23:37 -0400
Date: Tue, 10 May 2016 19:23:13 +0200
From: Peter Zijlstra
To: Borislav Petkov
Cc: x86-ml , Denys Vlasenko , "H. Peter Anvin" , Brian Gerst , LKML , Dmitry Vyukov , Andi Kleen , zengzhaoxiu@163.com, Thomas Gleixner , Ingo Molnar , Andrew Morton , Kees Cook , Zhaoxiu Zeng , Andy Lutomirski
Subject: Re: [PATCH -v2] x86/hweight: Get rid of the special calling convention
Message-ID: <20160510172313.GA3192@twins.programming.kicks-ass.net>
References: <20160407094333.GD3866@pd.tnic> <20160504184612.GC23257@pd.tnic> <5998407c-3497-22c1-45dc-a86afcb73c94@zytor.com> <20160504194101.GE23257@pd.tnic> <20160504202213.GF23257@pd.tnic> <572B446D.1030000@redhat.com> <20160505140446.GE534@pd.tnic> <20160510165318.GD28520@pd.tnic>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160510165318.GD28520@pd.tnic>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 10, 2016 at 06:53:18PM +0200, Borislav Petkov wrote:
>  static __always_inline unsigned int __arch_hweight32(unsigned int w)
>  {
> -       unsigned int res = 0;
> +       unsigned int res;
>
> -       asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
> -                    : "="REG_OUT (res)
> -                    : REG_IN (w));
> +       if (likely(static_cpu_has(X86_FEATURE_POPCNT))) {
> +               /* popcnt %eax, %eax */
> +               asm volatile(POPCNT32
> +                            : "="REG_OUT (res)
> +                            : REG_IN (w));
>
> -       return res;
> +               return res;
> +       }
> +       return __sw_hweight32(w);
>  }

So what was wrong with using the normal thunk_*.S wrappers for the calls?
That would allow you to use the alternative() stuff which does generate smaller code.
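
For illustration only, here is a user-space sketch of the two paths the quoted
patch selects between. It is not kernel code: cpu_has_popcnt is a hypothetical
stand-in for static_cpu_has(X86_FEATURE_POPCNT), sw_hweight32() is an assumed
bit-parallel fallback rather than the kernel's actual __sw_hweight32, and the
GCC/Clang builtin __builtin_popcount() stands in for the POPCNT instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for static_cpu_has(X86_FEATURE_POPCNT). */
static bool cpu_has_popcnt = false;

/* Assumed software fallback: classic bit-parallel population count. */
static unsigned int sw_hweight32(uint32_t w)
{
	w -= (w >> 1) & 0x55555555u;                      /* 2-bit sums */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit sums */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
	return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Run-time branch, mirroring the shape of the quoted patch. */
static unsigned int hweight32(uint32_t w)
{
	if (cpu_has_popcnt)
		return (unsigned int)__builtin_popcount(w); /* stand-in for POPCNT */
	return sw_hweight32(w);
}

/* Usage: both paths must agree on the same input. */
static void hweight32_selftest(void)
{
	assert(sw_hweight32(0xF0F0u) == (unsigned int)__builtin_popcount(0xF0F0u));
	assert(hweight32(0xFFu) == 8);
}
```

The sketch tests a flag on every call; the alternative() mechanism instead
patches the call site once at boot, which this user-space version does not
model.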