qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [5592] target-ppc: optimize popcntb
@ 2008-11-01  0:54 Aurelien Jarno
  2008-11-01 12:29 ` Laurent Desnogues
From: Aurelien Jarno @ 2008-11-01  0:54 UTC
  To: qemu-devel

Revision: 5592
          http://svn.sv.gnu.org/viewvc/?view=rev&root=qemu&revision=5592
Author:   aurel32
Date:     2008-11-01 00:54:33 +0000 (Sat, 01 Nov 2008)

Log Message:
-----------
target-ppc: optimize popcntb

Suggested by Andrzej Zaborowski.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

Modified Paths:
--------------
    trunk/target-ppc/op_helper.c

Modified: trunk/target-ppc/op_helper.c
===================================================================
--- trunk/target-ppc/op_helper.c	2008-11-01 00:54:23 UTC (rev 5591)
+++ trunk/target-ppc/op_helper.c	2008-11-01 00:54:33 UTC (rev 5592)
@@ -222,25 +222,19 @@
 
 target_ulong helper_popcntb (target_ulong val)
 {
-    uint32_t ret;
-    int i;
-
-    ret = 0;
-    for (i = 0; i < 32; i += 8)
-        ret |= ctpop8((val >> i) & 0xFF) << i;
-    return ret;
+    val = (val & 0x55555555) + ((val >>  1) & 0x55555555);
+    val = (val & 0x33333333) + ((val >>  2) & 0x33333333);
+    val = (val & 0x0f0f0f0f) + ((val >>  4) & 0x0f0f0f0f);
+    return val;
 }
 
 #if defined(TARGET_PPC64)
 target_ulong helper_popcntb_64 (target_ulong val)
 {
-    uint64_t ret;
-    int i;
-
-    ret = 0;
-    for (i = 0; i < 64; i += 8)
-        ret |= ctpop8((val >> i) & 0xFF) << i;
-    return ret;
+    val = (val & 0x5555555555555555ULL) + ((val >>  1) & 0x5555555555555555ULL);
+    val = (val & 0x3333333333333333ULL) + ((val >>  2) & 0x3333333333333333ULL);
+    val = (val & 0x0f0f0f0f0f0f0f0fULL) + ((val >>  4) & 0x0f0f0f0f0f0f0f0fULL);
+    return val;
 }
 #endif
 

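The three masked additions in the patch can be checked against a plain
per-byte loop. The following is a minimal standalone sketch, not code
from the QEMU tree: the names popcntb_ref and popcntb_swar are
hypothetical, and uint32_t stands in for a 32-bit target_ulong.

```c
#include <stdint.h>

/* Reference version (hypothetical name): count the set bits of each
 * byte with a plain loop and store each count in the corresponding
 * byte of the result. */
static uint32_t popcntb_ref(uint32_t val)
{
    uint32_t ret = 0;

    for (int i = 0; i < 32; i += 8) {
        uint32_t byte = (val >> i) & 0xFF;
        uint32_t count = 0;

        while (byte) {
            count += byte & 1;
            byte >>= 1;
        }
        ret |= count << i;
    }
    return ret;
}

/* Parallel version (hypothetical name), same three steps as the patch. */
static uint32_t popcntb_swar(uint32_t val)
{
    /* 16 two-bit fields, each holding the count of one bit pair (0..2) */
    val = (val & 0x55555555) + ((val >> 1) & 0x55555555);
    /* 8 four-bit fields, each holding the count of one nibble (0..4) */
    val = (val & 0x33333333) + ((val >> 2) & 0x33333333);
    /* 4 eight-bit fields, each holding the count of one byte (0..8) */
    val = (val & 0x0f0f0f0f) + ((val >> 4) & 0x0f0f0f0f);
    return val;
}
```

The sequence stops after the 4-bit step on purpose: going on to the
8-bit and 16-bit steps would add the byte counts together, whereas
popcntb needs one count per byte.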

* Re: [Qemu-devel] [5592] target-ppc: optimize popcntb
  2008-11-01  0:54 [Qemu-devel] [5592] target-ppc: optimize popcntb Aurelien Jarno
@ 2008-11-01 12:29 ` Laurent Desnogues
  2008-11-01 12:35   ` Laurent Desnogues
From: Laurent Desnogues @ 2008-11-01 12:29 UTC
  To: qemu-devel

On Sat, Nov 1, 2008 at 1:54 AM, Aurelien Jarno <aurelien@aurel32.net> wrote:
> Revision: 5592
>          http://svn.sv.gnu.org/viewvc/?view=rev&root=qemu&revision=5592
[...]
> Modified: trunk/target-ppc/op_helper.c
> ===================================================================
> --- trunk/target-ppc/op_helper.c        2008-11-01 00:54:23 UTC (rev 5591)
> +++ trunk/target-ppc/op_helper.c        2008-11-01 00:54:33 UTC (rev 5592)
> @@ -222,25 +222,19 @@
>
>  target_ulong helper_popcntb (target_ulong val)
>  {
> -    uint32_t ret;
> -    int i;
> -
> -    ret = 0;
> -    for (i = 0; i < 32; i += 8)
> -        ret |= ctpop8((val >> i) & 0xFF) << i;
> -    return ret;
> +    val = (val & 0x55555555) + ((val >>  1) & 0x55555555);
> +    val = (val & 0x33333333) + ((val >>  2) & 0x33333333);
> +    val = (val & 0x0f0f0f0f) + ((val >>  4) & 0x0f0f0f0f);
> +    return val;
>  }
>
>  #if defined(TARGET_PPC64)
>  target_ulong helper_popcntb_64 (target_ulong val)
>  {
> -    uint64_t ret;
> -    int i;
> -
> -    ret = 0;
> -    for (i = 0; i < 64; i += 8)
> -        ret |= ctpop8((val >> i) & 0xFF) << i;
> -    return ret;
> +    val = (val & 0x5555555555555555ULL) + ((val >>  1) & 0x5555555555555555ULL);
> +    val = (val & 0x3333333333333333ULL) + ((val >>  2) & 0x3333333333333333ULL);
> +    val = (val & 0x0f0f0f0f0f0f0f0fULL) + ((val >>  4) & 0x0f0f0f0f0f0f0f0fULL);
> +    return val;
>  }
>  #endif

Wouldn't it make sense to use builtins, as is done in host-utils.h?


Laurent


* Re: [Qemu-devel] [5592] target-ppc: optimize popcntb
  2008-11-01 12:29 ` Laurent Desnogues
@ 2008-11-01 12:35   ` Laurent Desnogues
  2008-11-01 13:57     ` andrzej zaborowski
From: Laurent Desnogues @ 2008-11-01 12:35 UTC
  To: qemu-devel

On Sat, Nov 1, 2008 at 1:29 PM, Laurent Desnogues
<laurent.desnogues@gmail.com> wrote:
>
> Wouldn't it make sense to use builtin's as is done in host-utils.h?

Forget that. I mistook it for a traditional whole-word bit count;
popcntb counts the set bits of each byte separately.


Laurent
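To illustrate the difference between the two operations, here is a
small sketch. It assumes GCC or Clang for __builtin_popcount (which
builtin the question had in mind is an assumption), and the function
names are hypothetical. The builtin returns a single count for the
whole word, while the popcntb helper keeps one count per byte; the
whole-word count can still be derived from the per-byte result by
summing the four bytes.

```c
#include <stdint.h>

/* Per-byte counts, following the three steps of the patch
 * (hypothetical standalone name). */
static uint32_t popcntb32(uint32_t val)
{
    val = (val & 0x55555555) + ((val >> 1) & 0x55555555);
    val = (val & 0x33333333) + ((val >> 2) & 0x33333333);
    val = (val & 0x0f0f0f0f) + ((val >> 4) & 0x0f0f0f0f);
    return val;
}

/* Whole-word count derived from the per-byte counts: multiplying by
 * 0x01010101 accumulates the sum of all four bytes in the top byte.
 * The sum is at most 32, so it cannot overflow that byte. */
static unsigned popcount_from_bytes(uint32_t val)
{
    return (popcntb32(val) * 0x01010101u) >> 24;
}

/* Whole-word count via the compiler builtin (GCC/Clang assumed). */
static unsigned popcount_builtin(uint32_t val)
{
    return (unsigned)__builtin_popcount(val);
}
```

For 0xFF00F001 the per-byte result is 0x08000401, while both
whole-word functions return 13.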


* Re: [Qemu-devel] [5592] target-ppc: optimize popcntb
  2008-11-01 12:35   ` Laurent Desnogues
@ 2008-11-01 13:57     ` andrzej zaborowski
  2008-11-01 14:34       ` Laurent Desnogues
From: andrzej zaborowski @ 2008-11-01 13:57 UTC
  To: qemu-devel

2008/11/1 Laurent Desnogues <laurent.desnogues@gmail.com>:
> On Sat, Nov 1, 2008 at 1:29 PM, Laurent Desnogues
> <laurent.desnogues@gmail.com> wrote:
>>
>> Wouldn't it make sense to use builtin's as is done in host-utils.h?
>
> Forget that, I thought it was traditional bit counting.

On a ppc host there might be a builtin for it. On x86 Xeon CPUs with
SSE4 there is also a bit-counting instruction, but this approach is
actually faster than transferring the number to the MMX register,
running the instruction, and copying the value back.  In the benchmarks
I've seen, its speed is comparable to table lookup on x86 with -O3.

Cheers
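For comparison, a table-lookup version of the same operation could
look roughly like the sketch below. This is not the benchmarked code,
and it makes no claim about relative speed; the names bits_in_byte and
popcntb_table are hypothetical. The table is generated at compile time
with the macro pattern described on Anderson's bithacks page.

```c
#include <stdint.h>

/* 256-entry table: bits_in_byte[i] is the number of set bits in i.
 * Each macro level appends two more bits to the index. */
#define B2(n) n, n + 1, n + 1, n + 2
#define B4(n) B2(n), B2(n + 1), B2(n + 1), B2(n + 2)
#define B6(n) B4(n), B4(n + 1), B4(n + 1), B4(n + 2)
static const uint8_t bits_in_byte[256] = {
    B6(0), B6(1), B6(1), B6(2)
};

/* One lookup per byte; each count goes back into the byte position
 * it came from. */
static uint32_t popcntb_table(uint32_t val)
{
    return  (uint32_t)bits_in_byte[val & 0xFF]
         | ((uint32_t)bits_in_byte[(val >> 8) & 0xFF] << 8)
         | ((uint32_t)bits_in_byte[(val >> 16) & 0xFF] << 16)
         | ((uint32_t)bits_in_byte[val >> 24] << 24);
}
```

The table version needs four dependent memory loads per call, whereas
the masked-add version works entirely in registers, which is one
reason the two can behave differently outside a tight benchmark loop.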


* Re: [Qemu-devel] [5592] target-ppc: optimize popcntb
  2008-11-01 13:57     ` andrzej zaborowski
@ 2008-11-01 14:34       ` Laurent Desnogues
From: Laurent Desnogues @ 2008-11-01 14:34 UTC
  To: qemu-devel

On Sat, Nov 1, 2008 at 2:57 PM, andrzej zaborowski <balrogg@gmail.com> wrote:
>
> On ppc host there might be a builtin for it, on the x86 Xeon cpus with
> SSE4 there's also a bitcounting instruction but this approach is
> actually faster than transferring the number to the MMX register,
> running the instruction and copying the value back.  In the benchmarks
> I've seen the speed is comparable with table lookup on x86 and -O3.

Bit tricks are always very sensitive.  BTW, I would not trust any
benchmark that runs them in a loop, for obvious reasons :-)

For those who like that kind of thing, here are some nice refs:

- Knuth:  http://www-cs-faculty.stanford.edu/~uno/fasc1a.ps.gz
- Anderson:  http://www-graphics.stanford.edu/~seander/bithacks.html
- Arndt:  http://www.jjj.de/bitwizardry/bitwizardrypage.html

Many of these tricks are known or obvious, but it's good reading
anyway, especially for qemu target and back-end writers.


Laurent
