public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Optimisation for smp_num_cpus loop in hotplug
@ 2002-06-20 23:41 James Bottomley
  2002-06-21 15:14 ` Rusty Russell
  0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2002-06-20 23:41 UTC (permalink / raw)
  To: Rusty Russell; +Cc: linux-kernel, James.Bottomley

The hotplug CPU patch introduces this to replace the loop over all active CPUs 
abstraction:

	for (i = 0; i < NR_CPUS; i++) {
		if (!cpu_online(i))
			continue;

Since the cpu online map is probably going to be quite sparse, could I suggest 
this alternative, which doesn't have to loop 32 times:

===== smp.h 1.11 vs edited =====
--- 1.11/include/asm-i386/smp.h Thu Jun 20 14:04:21 2002
+++ edited/smp.h        Thu Jun 20 12:39:33 2002
@@ -95,6 +95,11 @@
 
 #define cpu_online(cpu) (cpu_online_map & (1<<(cpu)))
 
+#define for_each_cpu(cpu, mask) \
+       for(mask = cpu_online_map; \
+           cpu = __ffs(mask), mask != 0; \
+           mask &= ~(1<<cpu))
+
 extern inline unsigned int num_online_cpus(void)
 {
        return hweight32(cpu_online_map);

I've implemented this for my voyager system (8 cpus, but still a sparse online 
bitmap), mainly because (for historical reasons), I have to do this loop in 
time critical IPI code.

James 



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Optimisation for smp_num_cpus loop in hotplug
       [not found] <mailman.1024617156.25656.linux-kernel2news@redhat.com>
@ 2002-06-21  4:16 ` Pete Zaitcev
  2002-06-21 13:10   ` James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: Pete Zaitcev @ 2002-06-21  4:16 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel

> 	for (i = 0; i < NR_CPUS; i++) {
> 		if (!cpu_online(i))
> 			continue;
> 
> Since the cpu online map is probably going to be quite sparse,
> could I suggest this alternative, which doesn't have to loop 32 times:

> +#define for_each_cpu(cpu, mask) \
> +       for(mask = cpu_online_map; \
> +           cpu = __ffs(mask), mask != 0; \
> +           mask &= ~(1<<cpu))
> +

This is neat, but I'd like to see it work with O(1) as well.
Mingo's code uses some long bitmaps.

-- Pete

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Optimisation for smp_num_cpus loop in hotplug
  2002-06-21  4:16 ` Pete Zaitcev
@ 2002-06-21 13:10   ` James Bottomley
  0 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2002-06-21 13:10 UTC (permalink / raw)
  To: Pete Zaitcev; +Cc: James Bottomley, linux-kernel

zaitcev@redhat.com said:
> This is neat, but I'd like to see it work with O(1) as well. Mingo's
> code uses some long bitmaps. 

Well, it's strictly only for the cpu bitmap (and it's designed to be different 
depending on the arch so you can use arch specific CPU enumeration knowlege).  
The O(1) optimisation for this is easy:

#ifdef CONFIG_SMP
#define for_each_cpu(cpu, mask) \
        for(mask = cpu_online_map; \
            cpu = __ffs(mask), mask != 0; \
            mask &= ~(1<<cpu))
#else
#define for_each_cpu(cpu, mask) cpu = 1; 
#endif

because you don't really want to run an SMP kernel on a machine which has only 
one cpu.  Or did you mean you'd like to see it work with the O(1) scheduler, 
some kind of generic for_each_set_bit?

James



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Optimisation for smp_num_cpus loop in hotplug
  2002-06-20 23:41 Optimisation for smp_num_cpus loop in hotplug James Bottomley
@ 2002-06-21 15:14 ` Rusty Russell
  2002-06-21 15:31   ` James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: Rusty Russell @ 2002-06-21 15:14 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel, mingo

In message <200206202341.g5KNfqU07377@localhost.localdomain> you write:
> The hotplug CPU patch introduces this to replace the loop over all active CPUs 
> abstraction:
> 
> 	for (i = 0; i < NR_CPUS; i++) {
> 		if (!cpu_online(i))
> 			continue;

Yeah, it's simple, and none of the current ones are really critical.
But I think we're better off with:
	for (i = first_cpu(); i < NR_CPUS; i = next_cpu(i)) {

Which is simple enough not to need an iterator macro, and also has the
bonus of giving irq-balancing et al. an efficient, portable way of
looking for the "next" cpu.

Cheers!
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Name: CPU iterator patch
Author: Rusty Russell
Status: Trivial

D: This patch adds first_cpu() & next_cpu(cpu) for more efficient cpu
D: iteration.

diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.24/include/asm-i386/smp.h working-2.5.24-cpu-iter/include/asm-i386/smp.h
--- linux-2.5.24/include/asm-i386/smp.h	Thu Jun 20 01:28:51 2002
+++ working-2.5.24-cpu-iter/include/asm-i386/smp.h	Sat Jun 22 00:55:46 2002
@@ -107,6 +107,18 @@
 	return -1;
 }
 
+/* x86 is always linear, ie. cpu_online(0) always true at the moment. */
+static inline unsigned int first_cpu(void)
+{
+	return 0;
+}
+
+/* Return "next" online cpu.  Returns NR_CPUS on end. */
+static inline unsigned int next_cpu(unsigned int cpu)
+{
+	return find_next_bit(&cpu_online_map, NR_CPUS, cpu+1);
+}
+
 static __inline int hard_smp_processor_id(void)
 {
 	/* we don't want to mark this access volatile - bad code generation */
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.24/include/asm-ia64/smp.h working-2.5.24-cpu-iter/include/asm-ia64/smp.h
--- linux-2.5.24/include/asm-ia64/smp.h	Thu Jun 20 01:28:51 2002
+++ working-2.5.24-cpu-iter/include/asm-ia64/smp.h	Sat Jun 22 00:59:34 2002
@@ -45,6 +45,17 @@
 
 extern unsigned long ap_wakeup_vector;
 
+static inline unsigned int first_cpu(void)
+{
+	return __ffs(cpu_online_map);
+}
+
+/* Return "next" online cpu.  Returns NR_CPUS on end. */
+static inline unsigned int next_cpu(unsigned int cpu)
+{
+	return find_next_bit(&cpu_online_map, NR_CPUS, cpu+1);
+}
+
 #define cpu_online(cpu) (cpu_online_map & (1<<(cpu)))
 extern inline unsigned int num_online_cpus(void)
 {
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.24/include/asm-ppc/smp.h working-2.5.24-cpu-iter/include/asm-ppc/smp.h
--- linux-2.5.24/include/asm-ppc/smp.h	Thu Jun 20 01:28:51 2002
+++ working-2.5.24-cpu-iter/include/asm-ppc/smp.h	Sat Jun 22 00:58:37 2002
@@ -47,6 +47,17 @@
 
 #define smp_processor_id() (current_thread_info()->cpu)
 
+static inline unsigned int first_cpu(void)
+{
+	return __ffs(cpu_online_map);
+}
+
+/* Return "next" online cpu.  Returns NR_CPUS on end. */
+static inline unsigned int next_cpu(unsigned int cpu)
+{
+	return find_next_bit(&cpu_online_map, NR_CPUS, cpu+1);
+}
+
 #define cpu_online(cpu) (cpu_online_map & (1<<(cpu)))
 
 extern inline unsigned int num_online_cpus(void)
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.24/include/asm-sparc64/smp.h working-2.5.24-cpu-iter/include/asm-sparc64/smp.h
--- linux-2.5.24/include/asm-sparc64/smp.h	Fri Jun 21 09:41:55 2002
+++ working-2.5.24-cpu-iter/include/asm-sparc64/smp.h	Sat Jun 22 01:06:58 2002
@@ -77,6 +77,17 @@
 	return -1;
 }
 
+static inline unsigned int first_cpu(void)
+{
+	return __ffs(cpu_online_map);
+}
+
+/* Return "next" online cpu.  Returns NR_CPUS on end. */
+static inline unsigned int next_cpu(unsigned int cpu)
+{
+	return find_next_bit(&cpu_online_map, NR_CPUS, cpu+1);
+}
+
 /*
  *	General functions that each host system must provide.
  */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Optimisation for smp_num_cpus loop in hotplug
  2002-06-21 15:14 ` Rusty Russell
@ 2002-06-21 15:31   ` James Bottomley
  2002-06-21 19:17     ` Rusty Russell
  0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2002-06-21 15:31 UTC (permalink / raw)
  To: Rusty Russell; +Cc: James Bottomley, linux-kernel, mingo

rusty@rustcorp.com.au said:
> Yeah, it's simple, and none of the current ones are really critical.
> But I think we're better off with:
> 	for (i = first_cpu(); i < NR_CPUS; i = next_cpu(i)) {

> Which is simple enough not to need an iterator macro, and also has the
> bonus of giving irq-balancing et al. an efficient, portable way of
> looking for the "next" cpu. 

So you're thinking that next_cpu(i) is something like

__ffs((~(unsigned)((1<<i)-1) & cpu_online_map)

plus an extra exception piece to take next_cpu(i) above NR_CPUS if we have no 
remaining CPUs (because __ffs would be undefined)?  It's the exception piece 
that I don't see how to do really efficiently.

James



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Optimisation for smp_num_cpus loop in hotplug
  2002-06-21 15:31   ` James Bottomley
@ 2002-06-21 19:17     ` Rusty Russell
  0 siblings, 0 replies; 6+ messages in thread
From: Rusty Russell @ 2002-06-21 19:17 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel, mingo

In message <200206211531.g5LFViZ07396@localhost.localdomain> you write:
> rusty@rustcorp.com.au said:
> > Yeah, it's simple, and none of the current ones are really critical.
> > But I think we're better off with:
> > 	for (i = first_cpu(); i < NR_CPUS; i = next_cpu(i)) {
> 
> > Which is simple enough not to need an iterator macro, and also has the
> > bonus of giving irq-balancing et al. an efficient, portable way of
> > looking for the "next" cpu. 
> 
> So you're thinking that next_cpu(i) is something like
> 
> __ffs((~(unsigned)((1<<i)-1) & cpu_online_map)
> 
> plus an extra exception piece to take next_cpu(i) above NR_CPUS if we have no
 
> remaining CPUs (because __ffs would be undefined)?  It's the exception piece 
> that I don't see how to do really efficiently.

find_next_bit already does this, but the generic one would look
something like:

	unsigned long mask = ~(unsigned long)((1<<(cpu+1))-1);
	if (mask & cpu_online_map)
		return _ffs(mask & cpu_online_map);
	return NR_CPUS;

Cheers!
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2002-06-21 19:13 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-06-20 23:41 Optimisation for smp_num_cpus loop in hotplug James Bottomley
2002-06-21 15:14 ` Rusty Russell
2002-06-21 15:31   ` James Bottomley
2002-06-21 19:17     ` Rusty Russell
     [not found] <mailman.1024617156.25656.linux-kernel2news@redhat.com>
2002-06-21  4:16 ` Pete Zaitcev
2002-06-21 13:10   ` James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox