public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] page: get_order() optimization
@ 2011-04-01 19:18 Maksym Planeta
  2011-04-01 19:34 ` H. Peter Anvin
  2011-04-01 19:40 ` Christophe JAILLET
  0 siblings, 2 replies; 4+ messages in thread
From: Maksym Planeta @ 2011-04-01 19:18 UTC (permalink / raw)
  To: mingo; +Cc: kernel-janitors, linux-kernel, hpa, Maksym Planeta

Loop was repalaced with __builtin_clz(). This still allows to precompute
constants, but on some architectures it uses special instruction to
calculate order.

Signed-off-by: Maksym Planeta <mcsim.planeta@gmail.com>
---
 include/asm-generic/getorder.h |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
index 67e7245..fe8020c 100644
--- a/include/asm-generic/getorder.h
+++ b/include/asm-generic/getorder.h
@@ -11,11 +11,9 @@ static inline __attribute_const__ int get_order(unsigned long size)
 	int order;
 
 	size = (size - 1) >> (PAGE_SHIFT - 1);
-	order = -1;
-	do {
-		size >>= 1;
-		order++;
-	} while (size);
+	order = (__builtin_clzl(size) ^ (BITS_PER_LONG - 1));
+	if (size == 0)
+		order = 0;
 	return order;
 }
 
-- 
1.7.2.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] page: get_order() optimization
  2011-04-01 19:18 [PATCH] page: get_order() optimization Maksym Planeta
@ 2011-04-01 19:34 ` H. Peter Anvin
  2011-04-02 13:06   ` Johannes Weiner
  2011-04-01 19:40 ` Christophe JAILLET
  1 sibling, 1 reply; 4+ messages in thread
From: H. Peter Anvin @ 2011-04-01 19:34 UTC (permalink / raw)
  To: Maksym Planeta; +Cc: mingo, kernel-janitors, linux-kernel

On 04/01/2011 12:18 PM, Maksym Planeta wrote:
> Loop was repalaced with __builtin_clz(). This still allows to precompute
> constants, but on some architectures it uses special instruction to
> calculate order.
> 
> Signed-off-by: Maksym Planeta <mcsim.planeta@gmail.com>
> ---
>  include/asm-generic/getorder.h |    8 +++-----
>  1 files changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
> index 67e7245..fe8020c 100644
> --- a/include/asm-generic/getorder.h
> +++ b/include/asm-generic/getorder.h
> @@ -11,11 +11,9 @@ static inline __attribute_const__ int get_order(unsigned long size)
>  	int order;
>  
>  	size = (size - 1) >> (PAGE_SHIFT - 1);
> -	order = -1;
> -	do {
> -		size >>= 1;
> -		order++;
> -	} while (size);
> +	order = (__builtin_clzl(size) ^ (BITS_PER_LONG - 1));
> +	if (size == 0)
> +		order = 0;
>  	return order;
>  }
>  

You need to guard this with __GNUC__ >= 4; there are still laggards
using gcc 3.  Furthermore, on some platforms __builtin_clz*() does a
libgcc call which may be undesirable.

For the generic case, one can do something like this instead of a loop:

static inline unsigned int __clzl(unsigned long v)
{
        unsigned int p;

#if BITS_PER_LONG == 64
	p = 63;

	if (v & 0xffffffff00000000UL) {
		p -= 32;
		v >>= 32;
	}
#else
	p = 31;
#endif

        if (v & 0xffff0000) {
                p -= 16;
                v >>= 16;
        }
        if (v & 0xff00) {
                p -= 8;
                v >>= 8;
        }
        if (v & 0xf0) {
                p -= 4;
                v >>= 4;
        }
        if (v & 0xc) {
                p -= 2;
                v >>= 2;
        }
        if (v & 0x2) {
                p -= 1;
                v >>= 1;
        }

        return p;
}

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] page: get_order() optimization
  2011-04-01 19:18 [PATCH] page: get_order() optimization Maksym Planeta
  2011-04-01 19:34 ` H. Peter Anvin
@ 2011-04-01 19:40 ` Christophe JAILLET
  1 sibling, 0 replies; 4+ messages in thread
From: Christophe JAILLET @ 2011-04-01 19:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: kernel-janitors

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1242 bytes --]

Hi,

why not have something like :
> + order = 0;
> + if (size != 0)
> + order = (__builtin_clzl(size) ^ (BITS_PER_LONG - 1));

we avoid the "complex" computation when not necessary.
No ?

Best regards,
Christophe JAILLET

"Maksym Planeta" <mcsim.planeta@gmail.com> a écrit dans le message de
news:1301685493-2567-1-git-send-email-mcsim.planeta@gmail.com...
> Loop was repalaced with __builtin_clz(). This still allows to precompute
> constants, but on some architectures it uses special instruction to
> calculate order.
>
> Signed-off-by: Maksym Planeta <mcsim.planeta@gmail.com>
> ---
>  include/asm-generic/getorder.h |    8 +++-----
>  1 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/include/asm-generic/getorder.h
b/include/asm-generic/getorder.h
> index 67e7245..fe8020c 100644
> --- a/include/asm-generic/getorder.h
> +++ b/include/asm-generic/getorder.h
> @@ -11,11 +11,9 @@ static inline __attribute_const__ int
get_order(unsigned long size)
>   int order;
>
>   size = (size - 1) >> (PAGE_SHIFT - 1);
> - order = -1;
> - do {
> - size >>= 1;
> - order++;
> - } while (size);
> + order = (__builtin_clzl(size) ^ (BITS_PER_LONG - 1));
> + if (size == 0)
> + order = 0;
>   return order;
>  }
>
> -- 
> 1.7.2.3
>




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] page: get_order() optimization
  2011-04-01 19:34 ` H. Peter Anvin
@ 2011-04-02 13:06   ` Johannes Weiner
  0 siblings, 0 replies; 4+ messages in thread
From: Johannes Weiner @ 2011-04-02 13:06 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Maksym Planeta, mingo, kernel-janitors, linux-kernel

On Fri, Apr 01, 2011 at 12:34:32PM -0700, H. Peter Anvin wrote:
> On 04/01/2011 12:18 PM, Maksym Planeta wrote:
> > Loop was repalaced with __builtin_clz(). This still allows to precompute
> > constants, but on some architectures it uses special instruction to
> > calculate order.
> > 
> > Signed-off-by: Maksym Planeta <mcsim.planeta@gmail.com>
> > ---
> >  include/asm-generic/getorder.h |    8 +++-----
> >  1 files changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
> > index 67e7245..fe8020c 100644
> > --- a/include/asm-generic/getorder.h
> > +++ b/include/asm-generic/getorder.h
> > @@ -11,11 +11,9 @@ static inline __attribute_const__ int get_order(unsigned long size)
> >  	int order;
> >  
> >  	size = (size - 1) >> (PAGE_SHIFT - 1);
> > -	order = -1;
> > -	do {
> > -		size >>= 1;
> > -		order++;
> > -	} while (size);
> > +	order = (__builtin_clzl(size) ^ (BITS_PER_LONG - 1));
> > +	if (size == 0)
> > +		order = 0;
> >  	return order;
> >  }
> >  
> 
> You need to guard this with __GNUC__ >= 4; there are still laggards
> using gcc 3.  Furthermore, on some platforms __builtin_clz*() does a
> libgcc call which may be undesirable.
> 
> For the generic case, one can do something like this instead of a loop:

It looks odd to me to count from the left and then subtract to get the
offset of the msb from the right.  Can't we just use fls() here, which
is already made available in both generic and arch-optimized versions?

	pages = size - 1 >> PAGE_SHIFT;
	if (pages)
		return __fls(pages) + 1;
	return 0;

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-04-02 13:07 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-04-01 19:18 [PATCH] page: get_order() optimization Maksym Planeta
2011-04-01 19:34 ` H. Peter Anvin
2011-04-02 13:06   ` Johannes Weiner
2011-04-01 19:40 ` Christophe JAILLET

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox