From: "H. Peter Anvin" <hpa@zytor.com>
To: Borislav Petkov <bp@amd64.org>
Cc: Michal Marek <mmarek@suse.cz>,
	linux-kbuild <linux-kbuild@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Wu Fengguang <fengguang.wu@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Jamie Lokier <jamie@shareable.org>,
	Roland Dreier <rdreier@cisco.com>,
	Al Viro <viro@ZenIV.linux.org.uk>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Ingo Molnar <mingo@elte.hu>, Brian Gerst <brgerst@gmail.com>
Subject: Re: [PATCH] x86: Add optimized popcnt variants
Date: Fri, 19 Feb 2010 08:06:07 -0800
Message-ID: <4B7EB6EF.9010405@zytor.com>
In-Reply-To: <20100219142205.GA32533@aftab>

On 02/19/2010 06:22 AM, Borislav Petkov wrote:
> --- /dev/null
> +++ b/arch/x86/lib/hweight.c
> @@ -0,0 +1,62 @@
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/bitops.h>
> +
> +#ifdef CONFIG_64BIT
> +/* popcnt %rdi, %rax */
> +#define POPCNT ".byte 0xf3\n\t.byte 0x48\n\t.byte 0x0f\n\t.byte 0xb8\n\t.byte 0xc7"
> +#define REG_IN "D"
> +#define REG_OUT "a"
> +#else
> +/* popcnt %eax, %eax */
> +#define POPCNT ".byte 0xf3\n\t.byte 0x0f\n\t.byte 0xb8\n\t.byte 0xc0"
> +#define REG_IN "a"
> +#define REG_OUT "a"
> +#endif
> +
> +/*
> + * __sw_hweightXX are called from within the alternatives below
> + * and callee-clobbered registers need to be taken care of. See
> + * ARCH_HWEIGHT_CFLAGS in <arch/x86/Kconfig> for the respective
> + * compiler switches.
> + */
> +unsigned int __arch_hweight32(unsigned int w)
> +{
> +	unsigned int res = 0;
> +
> +	asm (ALTERNATIVE("call __sw_hweight32", POPCNT, X86_FEATURE_POPCNT)
> +		     : "="REG_OUT (res)
> +		     : REG_IN (w));
> +
> +	return res;
> +}
> +EXPORT_SYMBOL(__arch_hweight32);
> +
> +unsigned int __arch_hweight16(unsigned int w)
> +{
> +	return __arch_hweight32(w & 0xffff);
> +}
> +EXPORT_SYMBOL(__arch_hweight16);
> +
> +unsigned int __arch_hweight8(unsigned int w)
> +{
> +	return __arch_hweight32(w & 0xff);
> +}
> +EXPORT_SYMBOL(__arch_hweight8);
> +
> +unsigned long __arch_hweight64(__u64 w)
> +{
> +	unsigned long res = 0;
> +
> +#ifdef CONFIG_X86_32
> +	return  __arch_hweight32((u32)w) +
> +		__arch_hweight32((u32)(w >> 32));
> +#else
> +	asm (ALTERNATIVE("call __sw_hweight64", POPCNT, X86_FEATURE_POPCNT)
> +		     : "="REG_OUT (res)
> +		     : REG_IN (w));
> +#endif /* CONFIG_X86_32 */
> +
> +	return res;
> +}

You're still not inlining these.  They should be inlined: code size is
no longer a reason to keep them out of line.

> diff --git a/include/asm-generic/bitops/arch_hweight.h b/include/asm-generic/bitops/arch_hweight.h
> index 3a7be84..1c82306 100644
> --- a/include/asm-generic/bitops/arch_hweight.h
> +++ b/include/asm-generic/bitops/arch_hweight.h
> @@ -3,9 +3,23 @@
>  
>  #include <asm/types.h>
>  
> -extern unsigned int __arch_hweight32(unsigned int w);
> -extern unsigned int __arch_hweight16(unsigned int w);
> -extern unsigned int __arch_hweight8(unsigned int w);
> -extern unsigned long __arch_hweight64(__u64 w);
> +unsigned int __arch_hweight32(unsigned int w)
> +{
> +	return __sw_hweight32(w);
> +}
>  
> +unsigned int __arch_hweight16(unsigned int w)
> +{
> +	return __sw_hweight16(w);
> +}
> +
> +unsigned int __arch_hweight8(unsigned int w)
> +{
> +	return __sw_hweight8(w);
> +}
> +
> +unsigned long __arch_hweight64(__u64 w)
> +{
> +	return __sw_hweight64(w);
> +}
>  #endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */

And these are in a header file, so they *definitely* should be inlines:
as written, every file that includes the header emits its own external
definition.
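
A minimal user-space sketch of the header-only shape being asked for,
under stated assumptions: the names sw_hweight32() and arch_hweightXX()
are hypothetical stand-ins for the kernel's __sw_hweightXX and
__arch_hweightXX, the fallback is a plain bit-twiddling popcount rather
than the kernel's lib/hweight.c, and the 64-bit variant simply sums two
32-bit halves as the CONFIG_X86_32 branch of the patch does.  It is not
the final patch, only an illustration of "static inline" wrappers that
can safely live in a header.

```c
#include <stdint.h>

/*
 * Hypothetical software fallback (stand-in for __sw_hweight32).
 * Classic parallel bit count: sum bits in pairs, nibbles, then bytes.
 */
static inline unsigned int sw_hweight32(uint32_t w)
{
	w -= (w >> 1) & 0x55555555u;
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	return (w * 0x01010101u) >> 24;
}

/*
 * "static inline" gives each includer its own internal-linkage copy,
 * so the header can be included from many files without clashing
 * external definitions at link time.
 */
static inline unsigned int arch_hweight32(unsigned int w)
{
	return sw_hweight32(w);
}

static inline unsigned int arch_hweight16(unsigned int w)
{
	return arch_hweight32(w & 0xffff);
}

static inline unsigned int arch_hweight8(unsigned int w)
{
	return arch_hweight32(w & 0xff);
}

/* Assumption: 64-bit count built from two 32-bit halves. */
static inline unsigned long arch_hweight64(uint64_t w)
{
	return arch_hweight32((uint32_t)w) +
	       arch_hweight32((uint32_t)(w >> 32));
}
```

With this shape a caller just includes the header and calls
arch_hweight32(x); the compiler is free to fold the wrapper into the
call site.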

	-hpa
-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

