public inbox for linux-kernel@vger.kernel.org
* [PATCH] smp: uninline num_online_cpus() & num_possible_cpus()
From: Eric Dumazet @ 2008-12-05 17:33 UTC
  To: Andrew Morton; +Cc: linux kernel

num_online_cpus() and num_possible_cpus() are not performance
critical and are quite large.

Uninlining them shrinks kernel text size by 7523 bytes on x86
if NR_CPUS > 32.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
 include/linux/cpumask.h |    4 ++--
 init/main.c             |   12 ++++++++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 21e1dd4..f9b2b51 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -507,8 +507,8 @@ extern cpumask_t cpu_present_map;
 extern cpumask_t cpu_active_map;
 
 #if NR_CPUS > 1
-#define num_online_cpus()	cpus_weight_nr(cpu_online_map)
-#define num_possible_cpus()	cpus_weight_nr(cpu_possible_map)
+extern int num_online_cpus(void);
+extern int num_possible_cpus(void);
 #define num_present_cpus()	cpus_weight_nr(cpu_present_map)
 #define cpu_online(cpu)		cpu_isset((cpu), cpu_online_map)
 #define cpu_possible(cpu)	cpu_isset((cpu), cpu_possible_map)
diff --git a/init/main.c b/init/main.c
index 7e117a2..a1a3e55 100644
--- a/init/main.c
+++ b/init/main.c
@@ -376,6 +376,18 @@ EXPORT_SYMBOL(cpu_mask_all);
 int nr_cpu_ids __read_mostly = NR_CPUS;
 EXPORT_SYMBOL(nr_cpu_ids);
 
+int num_online_cpus(void)
+{
+	return cpus_weight_nr(cpu_online_map);
+}
+EXPORT_SYMBOL(num_online_cpus);
+
+int num_possible_cpus(void)
+{
+	return cpus_weight_nr(cpu_possible_map);
+}
+EXPORT_SYMBOL(num_possible_cpus);
+
 /* An arch may set nr_cpu_ids earlier if needed, so this would be redundant */
 static void __init setup_nr_cpu_ids(void)
 {


* Re: [PATCH] smp: uninline num_online_cpus() & num_possible_cpus()
From: Andrew Morton @ 2008-12-08 22:43 UTC
  To: Eric Dumazet; +Cc: linux-kernel

On Fri, 05 Dec 2008 18:33:44 +0100
Eric Dumazet <dada1@cosmosbay.com> wrote:

> num_online_cpus() and num_possible_cpus() are not performance
> critical and are quite large.
> 
> Uninlining them shrinks kernel text size by 7523 bytes on x86
> if NR_CPUS > 32.
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> ---
>  include/linux/cpumask.h |    4 ++--
>  init/main.c             |   12 ++++++++++++
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 21e1dd4..f9b2b51 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -507,8 +507,8 @@ extern cpumask_t cpu_present_map;
>  extern cpumask_t cpu_active_map;
>  
>  #if NR_CPUS > 1
> -#define num_online_cpus()	cpus_weight_nr(cpu_online_map)
> -#define num_possible_cpus()	cpus_weight_nr(cpu_possible_map)
> +extern int num_online_cpus(void);
> +extern int num_possible_cpus(void);
>  #define num_present_cpus()	cpus_weight_nr(cpu_present_map)
>  #define cpu_online(cpu)		cpu_isset((cpu), cpu_online_map)
>  #define cpu_possible(cpu)	cpu_isset((cpu), cpu_possible_map)
> diff --git a/init/main.c b/init/main.c
> index 7e117a2..a1a3e55 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -376,6 +376,18 @@ EXPORT_SYMBOL(cpu_mask_all);
>  int nr_cpu_ids __read_mostly = NR_CPUS;
>  EXPORT_SYMBOL(nr_cpu_ids);
>  
> +int num_online_cpus(void)
> +{
> +	return cpus_weight_nr(cpu_online_map);
> +}
> +EXPORT_SYMBOL(num_online_cpus);
> +
> +int num_possible_cpus(void)
> +{
> +	return cpus_weight_nr(cpu_possible_map);
> +}
> +EXPORT_SYMBOL(num_possible_cpus);
> +
>  /* An arch may set nr_cpu_ids earlier if needed, so this would be redundant */
>  static void __init setup_nr_cpu_ids(void)
>  {
	
Looks OK.

That area in init/main.c is horrid - it took quite some staring through
the ifdef tangle for me to convince myself that the code you added was
reliably SMP-only.

Perhaps sometime a lot of this cpu masky code should be moved over to
kernel/cpu.c and cleaned up.



* Re: [PATCH] smp: uninline num_online_cpus() & num_possible_cpus()
From: Rusty Russell @ 2008-12-09  4:03 UTC
  To: Eric Dumazet; +Cc: Andrew Morton, linux kernel

On Saturday 06 December 2008 04:03:44 Eric Dumazet wrote:
> num_online_cpus() and num_possible_cpus() are not performance
> critical and are quite large.
> 
> Uninlining them shrinks kernel text size by 7523 bytes on x86
> if NR_CPUS > 32.

Hi Eric!

  Slight misdiagnosis, I think.  One base problem is addressed in fixing
the bitmap operators (see "[PATCH] bitmap: test for constant as well as
small size for inline versions" on lkml,
Message-Id: <200811160907.07140.rusty@rustcorp.com.au>).  This is already in
linux-next, and I've pasted it below.

  Worse, you used the obsolete cpumask operators :)

Thanks!
Rusty.

bitmap: test for constant as well as small size for inline versions

bitmap_zero et al have a fastpath for nbits <= BITS_PER_LONG, but this
should really only apply where the nbits is known at compile time.

This only saves about 1200 bytes on an allyesconfig kernel, but with
cpumasks going variable that number will increase.

    text    data     bss      dec     hex filename
35327852 5035607 6782976 47146435 2cf65c3 vmlinux-before
35326640 5035607 6782976 47145223 2cf6107 vmlinux-after

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 include/linux/bitmap.h |   35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff -r b4540ad329c1 include/linux/bitmap.h
--- a/include/linux/bitmap.h	Thu Nov 06 13:00:51 2008 +1100
+++ b/include/linux/bitmap.h	Thu Nov 06 14:34:07 2008 +1100
@@ -137,9 +137,12 @@ extern void bitmap_copy_le(void *dst, co
 		(1UL<<((nbits) % BITS_PER_LONG))-1 : ~0UL		\
 )
 
+#define small_const_nbits(nbits) \
+	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
+
 static inline void bitmap_zero(unsigned long *dst, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = 0UL;
 	else {
 		int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
@@ -150,7 +153,7 @@ static inline void bitmap_fill(unsigned 
 static inline void bitmap_fill(unsigned long *dst, int nbits)
 {
 	size_t nlongs = BITS_TO_LONGS(nbits);
-	if (nlongs > 1) {
+	if (!small_const_nbits(nbits)) {
 		int len = (nlongs - 1) * sizeof(unsigned long);
 		memset(dst, 0xff,  len);
 	}
@@ -160,7 +163,7 @@ static inline void bitmap_copy(unsigned 
 static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
 			int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = *src;
 	else {
 		int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
@@ -171,7 +174,7 @@ static inline void bitmap_and(unsigned l
 static inline void bitmap_and(unsigned long *dst, const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = *src1 & *src2;
 	else
 		__bitmap_and(dst, src1, src2, nbits);
@@ -180,7 +183,7 @@ static inline void bitmap_or(unsigned lo
 static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = *src1 | *src2;
 	else
 		__bitmap_or(dst, src1, src2, nbits);
@@ -189,7 +192,7 @@ static inline void bitmap_xor(unsigned l
 static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = *src1 ^ *src2;
 	else
 		__bitmap_xor(dst, src1, src2, nbits);
@@ -198,7 +201,7 @@ static inline void bitmap_andnot(unsigne
 static inline void bitmap_andnot(unsigned long *dst, const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = *src1 & ~(*src2);
 	else
 		__bitmap_andnot(dst, src1, src2, nbits);
@@ -207,7 +210,7 @@ static inline void bitmap_complement(uns
 static inline void bitmap_complement(unsigned long *dst, const unsigned long *src,
 			int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = ~(*src) & BITMAP_LAST_WORD_MASK(nbits);
 	else
 		__bitmap_complement(dst, src, nbits);
@@ -216,7 +219,7 @@ static inline int bitmap_equal(const uns
 static inline int bitmap_equal(const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
 	else
 		return __bitmap_equal(src1, src2, nbits);
@@ -225,7 +228,7 @@ static inline int bitmap_intersects(cons
 static inline int bitmap_intersects(const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		return ((*src1 & *src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0;
 	else
 		return __bitmap_intersects(src1, src2, nbits);
@@ -234,7 +237,7 @@ static inline int bitmap_subset(const un
 static inline int bitmap_subset(const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		return ! ((*src1 & ~(*src2)) & BITMAP_LAST_WORD_MASK(nbits));
 	else
 		return __bitmap_subset(src1, src2, nbits);
@@ -242,7 +245,7 @@ static inline int bitmap_subset(const un
 
 static inline int bitmap_empty(const unsigned long *src, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		return ! (*src & BITMAP_LAST_WORD_MASK(nbits));
 	else
 		return __bitmap_empty(src, nbits);
@@ -250,7 +253,7 @@ static inline int bitmap_empty(const uns
 
 static inline int bitmap_full(const unsigned long *src, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits));
 	else
 		return __bitmap_full(src, nbits);
@@ -258,7 +261,7 @@ static inline int bitmap_full(const unsi
 
 static inline int bitmap_weight(const unsigned long *src, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
 	return __bitmap_weight(src, nbits);
 }
@@ -266,7 +269,7 @@ static inline void bitmap_shift_right(un
 static inline void bitmap_shift_right(unsigned long *dst,
 			const unsigned long *src, int n, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = *src >> n;
 	else
 		__bitmap_shift_right(dst, src, n, nbits);
@@ -275,7 +278,7 @@ static inline void bitmap_shift_left(uns
 static inline void bitmap_shift_left(unsigned long *dst,
 			const unsigned long *src, int n, int nbits)
 {
-	if (nbits <= BITS_PER_LONG)
+	if (small_const_nbits(nbits))
 		*dst = (*src << n) & BITMAP_LAST_WORD_MASK(nbits);
 	else
 		__bitmap_shift_left(dst, src, n, nbits);
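
The effect of small_const_nbits() can be tried out in userspace. What
follows is only a minimal sketch, assuming GCC or Clang: the demo_*
names and LAST_WORD_MASK are hypothetical stand-ins invented for this
illustration, and __builtin_popcountl() is swapped in for the kernel's
hweight_long(). The inline single-word path is taken only when nbits is
a compile-time constant that fits in one long; any run-time nbits goes
to the out-of-line helper, so call sites with a variable size stay
small.

```c
#include <limits.h>

#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Mask of the valid bits in the last word (all ones if nbits is a
 * multiple of BITS_PER_LONG); mirrors the shape of the kernel macro. */
#define LAST_WORD_MASK(nbits) \
	(((nbits) % BITS_PER_LONG) ? \
		(1UL << ((nbits) % BITS_PER_LONG)) - 1 : ~0UL)

/* Same test as in the patch: constant AND small. */
#define small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)

/* Hypothetical out-of-line slow path: handles any nbits > 0. */
static int demo_bitmap_weight_slow(const unsigned long *src, int nbits)
{
	int full = nbits / BITS_PER_LONG;	/* number of complete words */
	int w = 0, i;

	for (i = 0; i < full; i++)
		w += __builtin_popcountl(src[i]);
	if (nbits % BITS_PER_LONG)		/* partial last word */
		w += __builtin_popcountl(src[full] & LAST_WORD_MASK(nbits));
	return w;
}

/* Hypothetical inline wrapper: single-word fast path only when the
 * compiler can prove nbits is a small constant. */
static inline int demo_bitmap_weight(const unsigned long *src, int nbits)
{
	if (small_const_nbits(nbits))
		return __builtin_popcountl(*src & LAST_WORD_MASK(nbits));
	return demo_bitmap_weight_slow(src, nbits);
}
```

Both paths return the same count, so which one the compiler picks (this
depends on optimization level and inlining) affects only the code
generated at each call site, not the result.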


* Re: [PATCH] smp: uninline num_online_cpus() & num_possible_cpus()
From: Eric Dumazet @ 2008-12-09  6:36 UTC
  To: Rusty Russell; +Cc: Andrew Morton, linux kernel

Rusty Russell wrote:
> On Saturday 06 December 2008 04:03:44 Eric Dumazet wrote:
>> num_online_cpus() and num_possible_cpus() are not performance
>> critical and are quite large.
>>
>> Uninlining them shrinks kernel text size by 7523 bytes on x86
>> if NR_CPUS > 32.
> 
> Hi Eric!
> 
>   Slight misdiagnosis, I think.  One base problem is addressed in fixing
> the bitmap operators (see "[PATCH] bitmap: test for constant as well as
> small size for inline versions" on lkml,
> Message-Id: <200811160907.07140.rusty@rustcorp.com.au>).  This is already in
> linux-next, and I've pasted it below.
> 
>   Worse, you used the obsolete cpumask operators :)
> 

I see! Good work ;)

So the gain would be about 11 bytes per call site, with about one hundred
call sites (roughly 1100 bytes in total), so maybe not worth it :)

Thanks



