From: David Howells <dhowells@redhat.com>
To: linux-arch@vger.kernel.org
Cc: dhowells@redhat.com
Subject: [PATCH 1/3] X86: Optimise fls(), ffs() and fls64()
Date: Wed, 13 Jan 2010 19:39:23 +0000 [thread overview]
Message-ID: <20100113193923.22272.87546.stgit@warthog.procyon.org.uk> (raw)
fls(N), ffs(N) and fls64(N) can be optimised on x86/x86_64. Currently they
perform checks against N being 0 before invoking the BSR/BSF instruction, or
use a CMOV instruction afterwards. Either the check involves a conditional
jump which we'd like to avoid, or a CMOV, which we'd also quite like to avoid.
Instead, we can make use of the fact that BSR/BSF doesn't modify its output
register if its input is 0. By preloading the output with -1 and incrementing
the result, we achieve the desired result without the need for a conditional
check.
The 32-bit version of fls64() can also have all its conditional checks removed
by chaining BSRs with a subtraction in the middle. However, this may be
suboptimal on CPUs for which BSR/BSF is really slow (i486 and below for
example).
Signed-off-by: David Howells <dhowells@redhat.com>
---
arch/x86/include/asm/bitops.h | 36 ++++++++++++------------------------
include/asm-generic/bitops/fls64.h | 21 +++++++++++++++------
2 files changed, 27 insertions(+), 30 deletions(-)
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 02b47a6..112d623 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -394,18 +394,12 @@ static inline unsigned long __fls(unsigned long word)
*/
static inline int ffs(int x)
{
- int r;
-#ifdef CONFIG_X86_CMOV
- asm("bsfl %1,%0\n\t"
- "cmovzl %2,%0"
- : "=r" (r) : "rm" (x), "r" (-1));
-#else
- asm("bsfl %1,%0\n\t"
- "jnz 1f\n\t"
- "movl $-1,%0\n"
- "1:" : "=r" (r) : "rm" (x));
-#endif
- return r + 1;
+ int bitpos = -1;
+
+ asm("bsfl %1,%0"
+ : "+r" (bitpos)
+ : "rm" (x));
+ return bitpos + 1;
}
/**
@@ -421,18 +415,12 @@ static inline int ffs(int x)
*/
static inline int fls(int x)
{
- int r;
-#ifdef CONFIG_X86_CMOV
- asm("bsrl %1,%0\n\t"
- "cmovzl %2,%0"
- : "=&r" (r) : "rm" (x), "rm" (-1));
-#else
- asm("bsrl %1,%0\n\t"
- "jnz 1f\n\t"
- "movl $-1,%0\n"
- "1:" : "=r" (r) : "rm" (x));
-#endif
- return r + 1;
+ int bitpos = -1;
+
+ asm("bsrl %1,%0"
+ : "+r" (bitpos)
+ : "rm" (x));
+ return bitpos + 1;
}
#endif /* __KERNEL__ */
diff --git a/include/asm-generic/bitops/fls64.h b/include/asm-generic/bitops/fls64.h
index b097cf8..f81e04c 100644
--- a/include/asm-generic/bitops/fls64.h
+++ b/include/asm-generic/bitops/fls64.h
@@ -18,16 +18,25 @@
static __always_inline int fls64(__u64 x)
{
__u32 h = x >> 32;
- if (h)
- return fls(h) + 32;
- return fls(x);
+ int bitpos = -1;
+
+ asm("bsrl %1,%0 \n"
+ "subl %2,%0 \n"
+ "bsrl %3,%0 \n"
+ : "+r" (bitpos)
+ : "rm" (x), "i"(32), "rm" (h));
+
+ return bitpos + 33;
}
#elif BITS_PER_LONG == 64
static __always_inline int fls64(__u64 x)
{
- if (x == 0)
- return 0;
- return __fls(x) + 1;
+ long bitpos = -1;
+
+ asm("bsrq %1,%0"
+ : "+r" (bitpos)
+ : "rm" (x));
+ return bitpos + 1;
}
#else
#error BITS_PER_LONG not 32 or 64
next reply other threads:[~2010-01-13 19:39 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-13 19:39 David Howells [this message]
2010-01-13 19:39 ` [PATCH 2/3] Adjust the comment on get_order() to describe the size==0 case David Howells
2010-01-13 19:39 ` [PATCH 3/3] Optimise get_order() David Howells
2010-01-13 20:15 ` [PATCH 1/3] X86: Optimise fls(), ffs() and fls64() Geert Uytterhoeven
2010-01-13 21:59 ` David Howells
[not found] <4BACCB4E.7010108@draigBrady.com>
2010-03-26 14:42 ` David Howells
2010-03-26 17:23 ` Linus Torvalds
2010-03-26 17:37 ` Scott Lurndal
2010-03-26 17:42 ` Linus Torvalds
2010-04-06 13:57 ` Jamie Lokier
2010-04-06 14:40 ` Linus Torvalds
2010-03-26 17:42 ` David Howells
2010-03-26 17:45 ` Linus Torvalds
2010-03-26 17:58 ` Ralf Baechle
2010-03-26 18:03 ` Linus Torvalds
2010-03-26 18:16 ` Matthew Wilcox
2010-04-06 13:30 ` Matthew Wilcox
2010-04-14 11:49 ` David Howells
2010-04-14 14:30 ` Avi Kivity
2010-04-15 8:48 ` David Howells
2010-04-15 8:49 ` Avi Kivity
2010-04-15 11:41 ` Jamie Lokier
2010-03-26 17:52 ` Matthew Wilcox
2010-04-14 13:13 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100113193923.22272.87546.stgit@warthog.procyon.org.uk \
--to=dhowells@redhat.com \
--cc=linux-arch@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).