All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander van Heukelum <heukelum@mailshack.com>
To: linux-arch <linux-arch@vger.kernel.org>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>,
	Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: [RFC] non-x86: Optimize find_next_(zero_)bit for small constant-size bitmaps
Date: Tue, 11 Mar 2008 16:22:39 +0100	[thread overview]
Message-ID: <20080311152239.GA15335@mailshack.com> (raw)
In-Reply-To: <20080311151719.GA15313@mailshack.com>

non-x86: Optimize small bitmap searches for the other archs.

Concerns archs that have their own implementation of find_next_bit
and find_next_zero_bit.

For archs that define CONFIG_GENERIC_FIND_NEXT_BIT, a patch is
submitted to be included in Ingo's x86-testing tree. It moves
an optimization for searching small, constant-sized bitmaps
from x86_64-specific to generic code. It works by creating
a new __always_inline-marked function in linux/bitops.h, and
renaming find_next_(zero_)bit to __find_next_(zero_)bit in
the generic code. The new code is currently guarded by
CONFIG_GENERIC_FIND_NEXT_BIT, so other archs are expected to
work as before.

To enable the optimization globally, all other archs need to
implement __find_next_bit and __find_next_zero_bit either
by exporting these symbols, or by implementing them as inline
functions in their asm/bitops.h.

It would also be nice if every implementation would have the
same declaration:

	unsigned long __find_next_(zero_)bit(unsigned long *,
			unsigned long, unsigned long);

I added a totally untested patch for reference.

Did I mis an arch?
Does the assembly still match the (changed) prototype?
Can we get concensus on doing the optimization in generic code?

Greetings,
	Alexander

 arch/arm/lib/findbit.S          |    6 ++++--
 arch/avr32/kernel/avr32_ksyms.c |    4 ++--
 arch/avr32/lib/findbit.S        |    4 ++--
 include/asm-arm/bitops.h        |   20 ++++++++++++--------
 include/asm-m68k/bitops.h       |    8 ++++----
 include/asm-powerpc/bitops.h    |    4 ----
 include/asm-s390/bitops.h       |   12 ++++++------
 include/linux/bitops.h          |    2 --
 8 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/arch/arm/lib/findbit.S b/arch/arm/lib/findbit.S
index a5ca024..5c92967 100644
--- a/arch/arm/lib/findbit.S
+++ b/arch/arm/lib/findbit.S
@@ -36,7 +36,8 @@ ENTRY(_find_first_zero_bit_le)
 
 /*
  * Purpose  : Find next 'zero' bit
- * Prototype: int find_next_zero_bit(void *addr, unsigned int maxbit, int offset)
+ * Prototype: unsigned long find_next_zero_bit(unsigned long *addr,
+ *			unsigned long maxbit, unsigned long offset);
  */
 ENTRY(_find_next_zero_bit_le)
 		teq	r1, #0
@@ -70,7 +71,8 @@ ENTRY(_find_first_bit_le)
 
 /*
  * Purpose  : Find next 'one' bit
- * Prototype: int find_next_zero_bit(void *addr, unsigned int maxbit, int offset)
+ * Prototype: unsigned long find_next_zero_bit(unsigned long *addr,
+ *			unsigned long maxbit, unsigned long offset);
  */
 ENTRY(_find_next_bit_le)
 		teq	r1, #0
diff --git a/arch/avr32/kernel/avr32_ksyms.c b/arch/avr32/kernel/avr32_ksyms.c
index 80f55f8..c228ed1 100644
--- a/arch/avr32/kernel/avr32_ksyms.c
+++ b/arch/avr32/kernel/avr32_ksyms.c
@@ -51,9 +51,9 @@ EXPORT_SYMBOL(__const_udelay);
 
 /* Bit operations (lib/findbit.S) */
 EXPORT_SYMBOL(find_first_zero_bit);
-EXPORT_SYMBOL(find_next_zero_bit);
+EXPORT_SYMBOL(__find_next_zero_bit);
 EXPORT_SYMBOL(find_first_bit);
-EXPORT_SYMBOL(find_next_bit);
+EXPORT_SYMBOL(__find_next_bit);
 EXPORT_SYMBOL(generic_find_next_zero_le_bit);
 
 /* I/O primitives (lib/io-*.S) */
diff --git a/arch/avr32/lib/findbit.S b/arch/avr32/lib/findbit.S
index c6b91de..729deab 100644
--- a/arch/avr32/lib/findbit.S
+++ b/arch/avr32/lib/findbit.S
@@ -29,7 +29,7 @@ ENTRY(find_first_zero_bit)
 	 *				    unsigned long size,
 	 *				    unsigned long offset)
 	 */
-ENTRY(find_next_zero_bit)
+ENTRY(__find_next_zero_bit)
 	lsr	r8, r10, 5
 	sub	r9, r11, r10
 	retle	r11
@@ -93,7 +93,7 @@ ENTRY(find_first_bit)
 	 *			       unsigned long size,
 	 *			       unsigned long offset)
 	 */
-ENTRY(find_next_bit)
+ENTRY(__find_next_bit)
 	lsr	r8, r10, 5
 	sub	r9, r11, r10
 	retle	r11
diff --git a/include/asm-arm/bitops.h b/include/asm-arm/bitops.h
index 5c60bfc..08b3163 100644
--- a/include/asm-arm/bitops.h
+++ b/include/asm-arm/bitops.h
@@ -158,9 +158,11 @@ extern int _test_and_set_bit_le(int nr, volatile unsigned long * p);
 extern int _test_and_clear_bit_le(int nr, volatile unsigned long * p);
 extern int _test_and_change_bit_le(int nr, volatile unsigned long * p);
 extern int _find_first_zero_bit_le(const void * p, unsigned size);
-extern int _find_next_zero_bit_le(const void * p, int size, int offset);
+extern unsigned long _find_next_zero_bit_le(const unsigned long * p,
+		unsigned long size, unsigned long offset);
 extern int _find_first_bit_le(const unsigned long *p, unsigned size);
-extern int _find_next_bit_le(const unsigned long *p, int size, int offset);
+extern unsigned long _find_next_bit_le(const unsigned long *p,
+		unsigned long size, unsigned long offset);
 
 /*
  * Big endian assembly bitops.  nr = 0 -> byte 3 bit 0.
@@ -172,9 +174,11 @@ extern int _test_and_set_bit_be(int nr, volatile unsigned long * p);
 extern int _test_and_clear_bit_be(int nr, volatile unsigned long * p);
 extern int _test_and_change_bit_be(int nr, volatile unsigned long * p);
 extern int _find_first_zero_bit_be(const void * p, unsigned size);
-extern int _find_next_zero_bit_be(const void * p, int size, int offset);
+extern unsigned long _find_next_zero_bit_be(const unsigned long * p,
+		unsigned long size, unsigned long offset);
 extern int _find_first_bit_be(const unsigned long *p, unsigned size);
-extern int _find_next_bit_be(const unsigned long *p, int size, int offset);
+extern unsigned long _find_next_bit_be(const unsigned long *p,
+		unsigned long size, unsigned long offset);
 
 #ifndef CONFIG_SMP
 /*
@@ -208,9 +212,9 @@ extern int _find_next_bit_be(const unsigned long *p, int size, int offset);
 #define test_and_clear_bit(nr,p)	ATOMIC_BITOP_LE(test_and_clear_bit,nr,p)
 #define test_and_change_bit(nr,p)	ATOMIC_BITOP_LE(test_and_change_bit,nr,p)
 #define find_first_zero_bit(p,sz)	_find_first_zero_bit_le(p,sz)
-#define find_next_zero_bit(p,sz,off)	_find_next_zero_bit_le(p,sz,off)
+#define __find_next_zero_bit(p,sz,off)	_find_next_zero_bit_le(p,sz,off)
 #define find_first_bit(p,sz)		_find_first_bit_le(p,sz)
-#define find_next_bit(p,sz,off)		_find_next_bit_le(p,sz,off)
+#define __find_next_bit(p,sz,off)	_find_next_bit_le(p,sz,off)
 
 #define WORD_BITOFF_TO_LE(x)		((x))
 
@@ -226,9 +230,9 @@ extern int _find_next_bit_be(const unsigned long *p, int size, int offset);
 #define test_and_clear_bit(nr,p)	ATOMIC_BITOP_BE(test_and_clear_bit,nr,p)
 #define test_and_change_bit(nr,p)	ATOMIC_BITOP_BE(test_and_change_bit,nr,p)
 #define find_first_zero_bit(p,sz)	_find_first_zero_bit_be(p,sz)
-#define find_next_zero_bit(p,sz,off)	_find_next_zero_bit_be(p,sz,off)
+#define __find_next_zero_bit(p,sz,off)	_find_next_zero_bit_be(p,sz,off)
 #define find_first_bit(p,sz)		_find_first_bit_be(p,sz)
-#define find_next_bit(p,sz,off)		_find_next_bit_be(p,sz,off)
+#define __find_next_bit(p,sz,off)	_find_next_bit_be(p,sz,off)
 
 #define WORD_BITOFF_TO_LE(x)		((x) ^ 0x18)
 
diff --git a/include/asm-m68k/bitops.h b/include/asm-m68k/bitops.h
index 83d1f28..c320c7a 100644
--- a/include/asm-m68k/bitops.h
+++ b/include/asm-m68k/bitops.h
@@ -199,8 +199,8 @@ out:
 	return ((long)p - (long)vaddr - 4) * 8 + res;
 }
 
-static inline int find_next_zero_bit(const unsigned long *vaddr, int size,
-				     int offset)
+static inline unsigned long __find_next_zero_bit(const unsigned long *vaddr,
+		unsigned long size, unsigned long offset)
 {
 	const unsigned long *p = vaddr + (offset >> 5);
 	int bit = offset & 31UL, res;
@@ -246,8 +246,8 @@ out:
 	return ((long)p - (long)vaddr - 4) * 8 + res;
 }
 
-static inline int find_next_bit(const unsigned long *vaddr, int size,
-				int offset)
+static inline unsigned long __find_next_bit(const unsigned long *vaddr,
+		unsigned long size, unsigned long offset)
 {
 	const unsigned long *p = vaddr + (offset >> 5);
 	int bit = offset & 31UL, res;
diff --git a/include/asm-powerpc/bitops.h b/include/asm-powerpc/bitops.h
index 220d9a7..8ae6667 100644
--- a/include/asm-powerpc/bitops.h
+++ b/include/asm-powerpc/bitops.h
@@ -317,8 +317,6 @@ static __inline__ int fls(unsigned int x)
 #include <asm-generic/bitops/hweight.h>
 
 #define find_first_zero_bit(addr, size) find_next_zero_bit((addr), (size), 0)
-unsigned long find_next_zero_bit(const unsigned long *addr,
-				 unsigned long size, unsigned long offset);
 /**
  * find_first_bit - find the first set bit in a memory region
  * @addr: The address to start the search at
@@ -328,8 +326,6 @@ unsigned long find_next_zero_bit(const unsigned long *addr,
  * containing a bit.
  */
 #define find_first_bit(addr, size) find_next_bit((addr), (size), 0)
-unsigned long find_next_bit(const unsigned long *addr,
-			    unsigned long size, unsigned long offset);
 
 /* Little-endian versions */
 
diff --git a/include/asm-s390/bitops.h b/include/asm-s390/bitops.h
index 965394e..13a5c33 100644
--- a/include/asm-s390/bitops.h
+++ b/include/asm-s390/bitops.h
@@ -691,9 +691,9 @@ static inline unsigned long find_first_bit(const unsigned long * addr,
  * @offset: The bitnumber to start searching at
  * @size: The maximum size to search
  */
-static inline int find_next_zero_bit (const unsigned long * addr,
-				      unsigned long size,
-				      unsigned long offset)
+static inline unsigned long __find_next_zero_bit(const unsigned long * addr,
+						 unsigned long size,
+						 unsigned long offset)
 {
         const unsigned long *p;
 	unsigned long bit, set;
@@ -727,9 +727,9 @@ static inline int find_next_zero_bit (const unsigned long * addr,
  * @offset: The bitnumber to start searching at
  * @size: The maximum size to search
  */
-static inline int find_next_bit (const unsigned long * addr,
-				 unsigned long size,
-				 unsigned long offset)
+static inline unsigned long __find_next_bit(const unsigned long * addr,
+					    unsigned long size,
+					    unsigned long offset)
 {
         const unsigned long *p;
 	unsigned long bit, set;
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 15360f9..68be268 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -73,7 +73,6 @@ static inline unsigned fls_long(unsigned long l)
 }
 
 #ifdef __KERNEL__
-#ifdef CONFIG_GENERIC_FIND_NEXT_BIT
 extern unsigned long __find_next_bit(const unsigned long *addr,
 		unsigned long size, unsigned long offset);
 
@@ -147,6 +146,5 @@ find_next_zero_bit(const unsigned long *addr, unsigned long size,
 	/* size is not constant or too big */
 	return __find_next_zero_bit(addr, size, offset);
 }
-#endif /* CONFIG_GENERIC_FIND_NEXT_BIT */
 #endif /* __KERNEL__ */
 #endif

  reply	other threads:[~2008-03-11 15:22 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-09 20:01 [PATCH] x86: Change x86 to use generic find_next_bit Alexander van Heukelum
2008-03-09 20:10 ` Ingo Molnar
2008-03-09 21:03   ` Andi Kleen
2008-03-09 21:32     ` Andi Kleen
2008-03-09 21:13   ` Alexander van Heukelum
2008-03-10  6:29     ` Ingo Molnar
2008-03-09 20:11 ` Ingo Molnar
2008-03-09 20:31   ` Alexander van Heukelum
2008-03-09 20:51     ` Ingo Molnar
2008-03-09 21:29       ` Andi Kleen
2008-03-10 23:17       ` [RFC/PATCH] x86: Optimize find_next_(zero_)bit for small constant-size bitmaps Alexander van Heukelum
2008-03-11  9:56         ` Ingo Molnar
2008-03-11 15:17           ` [PATCH] " Alexander van Heukelum
2008-03-11 15:22             ` Alexander van Heukelum [this message]
2008-03-11 15:23             ` Ingo Molnar
2008-03-09 20:28 ` [PATCH] x86: Change x86 to use generic find_next_bit Andi Kleen
2008-03-09 21:31 ` Andi Kleen
2008-03-13 12:44 ` Aneesh Kumar K.V
2008-03-13 14:27   ` Alexander van Heukelum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080311152239.GA15335@mailshack.com \
    --to=heukelum@mailshack.com \
    --cc=andi@firstfloor.org \
    --cc=heukelum@fastmail.fm \
    --cc=hpa@zytor.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.