public inbox for linux-kernel@vger.kernel.org
From: "H. Peter Anvin" <hpa@zytor.com>
To: Borislav Petkov <petkovbb@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Borislav Petkov <borislav.petkov@amd.com>,
	greg@kroah.com, mingo@elte.hu, norsk5@yahoo.com,
	tglx@linutronix.de, mchehab@redhat.com, aris@redhat.com,
	edt@aei.ca, linux-kernel@vger.kernel.org,
	randy.dunlap@oracle.com
Cc: Sam Ravnborg <sam@ravnborg.org>
Subject: Re: [PATCH 0/4] amd64_edac: misc fixes
Date: Mon, 01 Jun 2009 09:54:41 -0700
Message-ID: <4A2407D1.5050706@zytor.com>
In-Reply-To: <20090601145326.GA28260@liondog.tnic>

Borislav Petkov wrote:
> 
> How about we pin the src/dst into a register:
> 
> #define popcnt_spelled(x)                                       \
> ({                                                              \
>         typeof(x) __ret;                                        \
>         __asm__(".byte 0xf3\n\t.byte 0x48\n\t.byte 0x0f\n\t"    \
>                 ".byte 0xb8\n\t.byte 0xc0\n\t"                  \
>                 : "=a" (__ret)                                  \
>                 : "0" (x));                                     \
>         __ret;                                                  \
> })
> 
> which generates
> 
>   40055e:       48 8b 45 e8             mov    -0x18(%rbp),%rax
>   400562:       f3 48 0f b8 c0          popcnt %rax,%rax
>   400567:       48 89 45 f8             mov    %rax,-0x8(%rbp)
> 
> here.
> 

Yes, we would have to do something like that.

However, if you're doing that you shouldn't use typeof() there...
instead this should be turned into an inline function with explicit
64-bit types.
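
A minimal sketch of what such an inline function could look like, assuming a GCC-compatible compiler. The name popcnt64, the __POPCNT__ guard, and the __builtin_popcountll fallback are illustrative assumptions and do not come from this thread; the opcode bytes and the "=a"/"0" constraints are the ones from the quoted macro.

```c
#include <stdint.h>

/*
 * Hypothetical helper: a fixed 64-bit type replaces typeof(), so any
 * narrower argument is converted to uint64_t by the caller before the
 * instruction ever sees it.
 */
static inline uint64_t popcnt64(uint64_t x)
{
#if defined(__x86_64__) && defined(__POPCNT__)
	uint64_t ret;

	/* f3 48 0f b8 c0 encodes "popcnt %rax,%rax" */
	__asm__(".byte 0xf3, 0x48, 0x0f, 0xb8, 0xc0"
		: "=a" (ret)	/* result comes back in %rax */
		: "0" (x)	/* input is pinned to the same register */
		: "cc");	/* popcnt rewrites the flags */
	return ret;
#else
	/* Assumed fallback so the sketch builds for other targets. */
	return (uint64_t)__builtin_popcountll(x);
#endif
}
```

The compile-time guard only keeps the sketch from emitting the raw bytes for a target that may lack the instruction; how the kernel would select between the two paths is a separate question.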

It would be good if we could get Kbuild to export some kind of macro
that we can use to test binutils version, so we can do something like:

#if BINUTILS_VERSION >= KERNEL_VERSION(2,18,50)
/* Do the right thing */
#else
/* Do the wrong thing */
#endif
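
To make the comparison concrete, here is a sketch of how such a test could evaluate. BINUTILS_VERSION does not exist today; the sketch assumes the build system would pass it with -D, and supplies a made-up default only so the fragment compiles on its own. HAVE_AS_POPCNT is likewise a hypothetical name. KERNEL_VERSION is written out the way <linux/version.h> defines it.

```c
/* Packs major.minor.patch into one comparable integer. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical: Kbuild would export this, e.g. via
 * -DBINUTILS_VERSION=... on the compiler command line.
 * The default below is a placeholder, not a detected value.
 */
#ifndef BINUTILS_VERSION
#define BINUTILS_VERSION KERNEL_VERSION(2, 19, 0)
#endif

#if BINUTILS_VERSION >= KERNEL_VERSION(2, 18, 50)
#define HAVE_AS_POPCNT 1	/* assembler new enough: use the mnemonic */
#else
#define HAVE_AS_POPCNT 0	/* old assembler: fall back to .byte */
#endif
```

Each component gets its own bit field, so ordinary integer comparison orders versions correctly as long as minor and patch stay below 256.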

> For < 64bit operand sizes, the operands get zero-extended so that
> garbage in the high 32/48 bits of %rax doesn't corrupt the result.
> We might even want to do the movzwq explicitly so that some compiler
> doesn't decide to take the version with the "0f b6" opcode which
> zero-extends only the 16-/32-bit register. This way, you can popcnt even
> single bytes although the popcnt implementation doesn't allow single
> byte operands.
> 
>   400572:       0f b7 45 f2             movzwl -0xe(%rbp),%eax
>   400579:       f3 48 0f b8 c0          popcnt %rax,%rax
>   40057e:       66 89 45 f6             mov    %ax,-0xa(%rbp)
> 
> 
> So, in addition to popcnt itself, we have two movs added. This is still
> less than the 30+ ops (+ function call overhead) that hweight* get
> translated into. I'll redo my kernel build benchmarks tomorrow to get
> some more recent numbers on the performance gain.

With explicit types, the compiler should do the right thing.
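
As a rough illustration: once the 64-bit primitive takes a uint64_t, narrower wrappers need no hand-written movzx, because the C conversion from an unsigned narrow type to uint64_t zero-extends by definition. All names below are hypothetical, and __builtin_popcountll stands in for the popcnt-based primitive so the sketch stays portable.

```c
#include <stdint.h>

/* Stand-in for the 64-bit popcnt primitive (assumption, see above). */
static inline unsigned int count_bits64(uint64_t x)
{
	return (unsigned int)__builtin_popcountll(x);
}

/* uint16_t -> uint64_t zero-extends, so the high 48 bits are clear. */
static inline unsigned int count_bits16(uint16_t x)
{
	return count_bits64(x);
}

/* Same for single bytes, which popcnt itself cannot take as operands. */
static inline unsigned int count_bits8(uint8_t x)
{
	return count_bits64(x);
}
```

The wrappers take unsigned types on purpose: a negative signed value would be sign-extended on conversion and the extra high bits would be counted.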

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



Thread overview: 20+ messages
2009-05-20 18:43 [PATCH 0/4] amd64_edac: misc fixes Borislav Petkov
2009-05-20 18:43 ` [PATCH 1/4] x86: msr.h: fix build error Borislav Petkov
2009-05-20 18:43 ` [PATCH 2/4] amd64_edac: do not enable module by default Borislav Petkov
2009-05-20 18:43 ` [PATCH 3/4] EDAC: do not enable modules " Borislav Petkov
2009-05-20 18:43 ` [PATCH 4/4] amd64_edac: add MAINTAINERS entry Borislav Petkov
2009-05-20 21:41 ` [PATCH 0/4] amd64_edac: misc fixes Randy Dunlap
2009-05-28 23:47 ` Andrew Morton
2009-05-29 10:33   ` Borislav Petkov
2009-05-29 20:01     ` Andrew Morton
2009-05-30  8:19       ` Borislav Petkov
2009-05-30  8:40         ` Andrew Morton
2009-05-30 10:31           ` Borislav Petkov
2009-05-30 19:22           ` H. Peter Anvin
2009-06-01 14:53             ` Borislav Petkov
2009-06-01 16:54               ` H. Peter Anvin [this message]
2009-06-01 17:02                 ` H. Peter Anvin
2009-06-01 17:31                   ` H.J. Lu
2009-06-01 18:12                 ` Borislav Petkov
2009-06-01 18:57                   ` H. Peter Anvin
2009-06-03 18:20                     ` Borislav Petkov
