All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Alexander van Heukelum" <heukelum@fastmail.fm>
To: "dean gaudet" <dean@arctic.org>,
	"Alexander van Heukelum" <heukelum@mailshack.com>
Cc: "Ingo Molnar" <mingo@elte.hu>, "Andi Kleen" <andi@firstfloor.org>,
	"LKML" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86: generic versions of find_first_(zero_)bit, convert   i386
Date: Sun, 06 Apr 2008 20:51:37 +0200	[thread overview]
Message-ID: <1207507897.18129.1246358115@webmail.messagingengine.com> (raw)
In-Reply-To: <alpine.DEB.1.10.0804060958580.7751@twinlark.arctic.org>

On Sun, 6 Apr 2008 10:03:43 -0700 (PDT), "dean gaudet" <dean@arctic.org>
said:
> fwiw there's a way to do ffz / ntz which can do lg(n) conditional moves in 
> parallel... i'm not sure what (non-x86) architectures this might be best 
> on, but it might be a good choice for the generic code... although maybe 
> the large number of constants required will be a burden on RISC 
> processors.

Hello Dean,

The current generic implementation of ffz is O(lg(n)) already, but
the version you suggest might indeed be a bit faster if the compiler
recognises that is can use conditional moves and the architecture
can handle large constants efficiently.

On the other had, the bit-search functions tend to be avoided as
much as possible, because they are often not implemented as a
hardware instruction and even if they are implemented in hardware,
they might be slow. The generic version is slow anyhow. That's
why the bitmap searches first test if a word in the bitmap is
all-0-bits/all-1-bits. The single-word version of ffz might even
be better off if it was optimized for size instead of being fully
unrolled!

> take a look at figure 5-17 here http://hackersdelight.org/revisions.pdf
> 
> int ntz(unsigned x) {
> 	unsigned y, bz, b4, b3, b2, b1, b0;
> 	y = x & -x; // Isolate rightmost 1-bit.
> 	bz = y ? 0 : 1; // 1 if y = 0.
> 	b4 = (y & 0x0000FFFF) ? 0 : 16;
> 	b3 = (y & 0x00FF00FF) ? 0 : 8;
> 	b2 = (y & 0x0F0F0F0F) ? 0 : 4;
> 	b1 = (y & 0x33333333) ? 0 : 2;
> 	b0 = (y & 0x55555555) ? 0 : 1;
> 	return bz + b4 + b3 + b2 + b1 + b0;
> }

Note: mask32 = ~0ul; mask16 = mask32 ^ (mask32 << 16), mask8 = ...

Greetings,
    Alexander
> 
> -dean
-- 
  Alexander van Heukelum
  heukelum@fastmail.fm

-- 
http://www.fastmail.fm - Or how I learned to stop worrying and
                          love email again


  reply	other threads:[~2008-04-06 18:51 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-31 17:15 [PATCH] x86: generic versions of find_first_(zero_)bit, convert i386 Alexander van Heukelum
2008-03-31 17:22 ` Stephen Hemminger
2008-03-31 19:38   ` Alexander van Heukelum
2008-03-31 21:58     ` Andi Kleen
2008-04-01  8:47 ` Ingo Molnar
2008-04-01  9:46   ` Alexander van Heukelum
2008-04-01 15:41     ` [PATCH] x86: switch x86_64 to generic find_first_bit Alexander van Heukelum
2008-04-01 15:42       ` [PATCH] x86: optimize find_first_bit for small bitmaps Alexander van Heukelum
2008-04-01 15:47         ` [PATCH] x86: remove x86-specific implementations of find_first_bit Alexander van Heukelum
2008-04-03  9:34           ` Alexander van Heukelum
2008-04-04  8:47           ` Ingo Molnar
2008-04-06 17:03     ` [PATCH] x86: generic versions of find_first_(zero_)bit, convert i386 dean gaudet
2008-04-06 18:51       ` Alexander van Heukelum [this message]
2008-04-06 20:22         ` dean gaudet
2008-04-07  8:43           ` Ingo Molnar
2008-04-07 10:25           ` Alexander van Heukelum
2008-04-18 20:18             ` Alternative implementation of the generic __ffs Alexander van Heukelum
2008-04-18 23:46               ` dean gaudet
2008-04-19  0:09                 ` Harvey Harrison
2008-04-19  0:20                   ` dean gaudet
2008-04-19  0:58                     ` Joe Perches
2008-04-19  1:04                       ` Harvey Harrison
2008-04-19  1:11                         ` dean gaudet
2008-04-19  2:55                           ` Joe Perches
2008-04-19  4:13                             ` dean gaudet
2008-04-19 10:05                               ` Mikael Pettersson
2008-04-19 12:10                               ` Alexander van Heukelum
2008-04-19 18:17                                 ` Joe Perches
2008-04-19 20:26                                   ` Alexander van Heukelum
2008-04-19 22:29                             ` Matti Aarnio
2008-04-20  3:06                               ` Joe Perches
2008-04-20  8:42                                 ` Alexander van Heukelum
2008-04-20 12:31                                   ` Matti Aarnio
2008-04-21 11:43                                     ` Alexander van Heukelum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1207507897.18129.1246358115@webmail.messagingengine.com \
    --to=heukelum@fastmail.fm \
    --cc=andi@firstfloor.org \
    --cc=dean@arctic.org \
    --cc=heukelum@mailshack.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.