* Optimisation for smp_num_cpus loop in hotplug
@ 2002-06-20 23:41 James Bottomley
2002-06-21 15:14 ` Rusty Russell
0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2002-06-20 23:41 UTC (permalink / raw)
To: Rusty Russell; +Cc: linux-kernel, James.Bottomley
The hotplug CPU patch introduces this to replace the loop over all active CPUs
abstraction:
for (i = 0; i < NR_CPUS; i++) {
if (!cpu_online(i))
continue;
Since the cpu online map is probably going to be quite sparse, could I suggest
this alternative, which doesn't have to loop 32 times:
===== smp.h 1.11 vs edited =====
--- 1.11/include/asm-i386/smp.h Thu Jun 20 14:04:21 2002
+++ edited/smp.h Thu Jun 20 12:39:33 2002
@@ -95,6 +95,11 @@
#define cpu_online(cpu) (cpu_online_map & (1<<(cpu)))
+#define for_each_cpu(cpu, mask) \
+ for(mask = cpu_online_map; \
+ cpu = __ffs(mask), mask != 0; \
+ mask &= ~(1<<cpu))
+
extern inline unsigned int num_online_cpus(void)
{
return hweight32(cpu_online_map);
I've implemented this for my voyager system (8 cpus, but still a sparse online
bitmap), mainly because (for historical reasons), I have to do this loop in
time critical IPI code.
James
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Optimisation for smp_num_cpus loop in hotplug
[not found] <mailman.1024617156.25656.linux-kernel2news@redhat.com>
@ 2002-06-21 4:16 ` Pete Zaitcev
2002-06-21 13:10 ` James Bottomley
0 siblings, 1 reply; 6+ messages in thread
From: Pete Zaitcev @ 2002-06-21 4:16 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-kernel
> for (i = 0; i < NR_CPUS; i++) {
> if (!cpu_online(i))
> continue;
>
> Since the cpu online map is probably going to be quite sparse,
> could I suggest this alternative, which doesn't have to loop 32 times:
> +#define for_each_cpu(cpu, mask) \
> + for(mask = cpu_online_map; \
> + cpu = __ffs(mask), mask != 0; \
> + mask &= ~(1<<cpu))
> +
This is neat, but I'd like to see it work with O(1) as well.
Mingo's code uses some long bitmaps.
-- Pete
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Optimisation for smp_num_cpus loop in hotplug
2002-06-21 4:16 ` Optimisation for smp_num_cpus loop in hotplug Pete Zaitcev
@ 2002-06-21 13:10 ` James Bottomley
0 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2002-06-21 13:10 UTC (permalink / raw)
To: Pete Zaitcev; +Cc: James Bottomley, linux-kernel
zaitcev@redhat.com said:
> This is neat, but I'd like to see it work with O(1) as well. Mingo's
> code uses some long bitmaps.
Well, it's strictly only for the cpu bitmap (and it's designed to be different
depending on the arch so you can use arch specific CPU enumeration knowlege).
The O(1) optimisation for this is easy:
#ifdef CONFIG_SMP
#define for_each_cpu(cpu, mask) \
for(mask = cpu_online_map; \
cpu = __ffs(mask), mask != 0; \
mask &= ~(1<<cpu))
#else
#define for_each_cpu(cpu, mask) cpu = 1;
#endif
because you don't really want to run an SMP kernel on a machine which has only
one cpu. Or did you mean you'd like to see it work with the O(1) scheduler,
some kind of generic for_each_set_bit?
James
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Optimisation for smp_num_cpus loop in hotplug
2002-06-20 23:41 James Bottomley
@ 2002-06-21 15:14 ` Rusty Russell
2002-06-21 15:31 ` James Bottomley
0 siblings, 1 reply; 6+ messages in thread
From: Rusty Russell @ 2002-06-21 15:14 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-kernel, mingo
In message <200206202341.g5KNfqU07377@localhost.localdomain> you write:
> The hotplug CPU patch introduces this to replace the loop over all active CPUs
> abstraction:
>
> for (i = 0; i < NR_CPUS; i++) {
> if (!cpu_online(i))
> continue;
Yeah, it's simple, and none of the current ones are really critical.
But I think we're better off with:
for (i = first_cpu(); i < NR_CPUS; i = next_cpu(i)) {
Which is simple enough not to need an iterator macro, and also has the
bonus of giving irq-balancing et al. an efficient, portable way of
looking for the "next" cpu.
Cheers!
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Name: CPU iterator patch
Author: Rusty Russell
Status: Trivial
D: This patch adds first_cpu() & next_cpu(cpu) for more efficient cpu
D: iteration.
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.24/include/asm-i386/smp.h working-2.5.24-cpu-iter/include/asm-i386/smp.h
--- linux-2.5.24/include/asm-i386/smp.h Thu Jun 20 01:28:51 2002
+++ working-2.5.24-cpu-iter/include/asm-i386/smp.h Sat Jun 22 00:55:46 2002
@@ -107,6 +107,18 @@
return -1;
}
+/* x86 is always linear, ie. cpu_online(0) always true at the moment. */
+static inline unsigned int first_cpu(void)
+{
+ return 0;
+}
+
+/* Return "next" online cpu. Returns NR_CPUS on end. */
+static inline unsigned int next_cpu(unsigned int cpu)
+{
+ return find_next_bit(&cpu_online_map, NR_CPUS, cpu+1);
+}
+
static __inline int hard_smp_processor_id(void)
{
/* we don't want to mark this access volatile - bad code generation */
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.24/include/asm-ia64/smp.h working-2.5.24-cpu-iter/include/asm-ia64/smp.h
--- linux-2.5.24/include/asm-ia64/smp.h Thu Jun 20 01:28:51 2002
+++ working-2.5.24-cpu-iter/include/asm-ia64/smp.h Sat Jun 22 00:59:34 2002
@@ -45,6 +45,17 @@
extern unsigned long ap_wakeup_vector;
+static inline unsigned int first_cpu(void)
+{
+ return __ffs(cpu_online_map);
+}
+
+/* Return "next" online cpu. Returns NR_CPUS on end. */
+static inline unsigned int next_cpu(unsigned int cpu)
+{
+ return find_next_bit(&cpu_online_map, NR_CPUS, cpu+1);
+}
+
#define cpu_online(cpu) (cpu_online_map & (1<<(cpu)))
extern inline unsigned int num_online_cpus(void)
{
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.24/include/asm-ppc/smp.h working-2.5.24-cpu-iter/include/asm-ppc/smp.h
--- linux-2.5.24/include/asm-ppc/smp.h Thu Jun 20 01:28:51 2002
+++ working-2.5.24-cpu-iter/include/asm-ppc/smp.h Sat Jun 22 00:58:37 2002
@@ -47,6 +47,17 @@
#define smp_processor_id() (current_thread_info()->cpu)
+static inline unsigned int first_cpu(void)
+{
+ return __ffs(cpu_online_map);
+}
+
+/* Return "next" online cpu. Returns NR_CPUS on end. */
+static inline unsigned int next_cpu(unsigned int cpu)
+{
+ return find_next_bit(&cpu_online_map, NR_CPUS, cpu+1);
+}
+
#define cpu_online(cpu) (cpu_online_map & (1<<(cpu)))
extern inline unsigned int num_online_cpus(void)
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.24/include/asm-sparc64/smp.h working-2.5.24-cpu-iter/include/asm-sparc64/smp.h
--- linux-2.5.24/include/asm-sparc64/smp.h Fri Jun 21 09:41:55 2002
+++ working-2.5.24-cpu-iter/include/asm-sparc64/smp.h Sat Jun 22 01:06:58 2002
@@ -77,6 +77,17 @@
return -1;
}
+static inline unsigned int first_cpu(void)
+{
+ return __ffs(cpu_online_map);
+}
+
+/* Return "next" online cpu. Returns NR_CPUS on end. */
+static inline unsigned int next_cpu(unsigned int cpu)
+{
+ return find_next_bit(&cpu_online_map, NR_CPUS, cpu+1);
+}
+
/*
* General functions that each host system must provide.
*/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Optimisation for smp_num_cpus loop in hotplug
2002-06-21 15:14 ` Rusty Russell
@ 2002-06-21 15:31 ` James Bottomley
2002-06-21 19:17 ` Rusty Russell
0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2002-06-21 15:31 UTC (permalink / raw)
To: Rusty Russell; +Cc: James Bottomley, linux-kernel, mingo
rusty@rustcorp.com.au said:
> Yeah, it's simple, and none of the current ones are really critical.
> But I think we're better off with:
> for (i = first_cpu(); i < NR_CPUS; i = next_cpu(i)) {
> Which is simple enough not to need an iterator macro, and also has the
> bonus of giving irq-balancing et al. an efficient, portable way of
> looking for the "next" cpu.
So you're thinking that next_cpu(i) is something like
__ffs((~(unsigned)((1<<i)-1) & cpu_online_map)
plus an extra exception piece to take next_cpu(i) above NR_CPUS if we have no
remaining CPUs (because __ffs would be undefined)? It's the exception piece
that I don't see how to do really efficiently.
James
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Optimisation for smp_num_cpus loop in hotplug
2002-06-21 15:31 ` James Bottomley
@ 2002-06-21 19:17 ` Rusty Russell
0 siblings, 0 replies; 6+ messages in thread
From: Rusty Russell @ 2002-06-21 19:17 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-kernel, mingo
In message <200206211531.g5LFViZ07396@localhost.localdomain> you write:
> rusty@rustcorp.com.au said:
> > Yeah, it's simple, and none of the current ones are really critical.
> > But I think we're better off with:
> > for (i = first_cpu(); i < NR_CPUS; i = next_cpu(i)) {
>
> > Which is simple enough not to need an iterator macro, and also has the
> > bonus of giving irq-balancing et al. an efficient, portable way of
> > looking for the "next" cpu.
>
> So you're thinking that next_cpu(i) is something like
>
> __ffs((~(unsigned)((1<<i)-1) & cpu_online_map)
>
> plus an extra exception piece to take next_cpu(i) above NR_CPUS if we have no
> remaining CPUs (because __ffs would be undefined)? It's the exception piece
> that I don't see how to do really efficiently.
find_next_bit already does this, but the generic one would look
something like:
unsigned long mask = ~(unsigned long)((1<<(cpu+1))-1);
if (mask & cpu_online_map)
return _ffs(mask & cpu_online_map);
return NR_CPUS;
Cheers!
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2002-06-21 19:13 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <mailman.1024617156.25656.linux-kernel2news@redhat.com>
2002-06-21 4:16 ` Optimisation for smp_num_cpus loop in hotplug Pete Zaitcev
2002-06-21 13:10 ` James Bottomley
2002-06-20 23:41 James Bottomley
2002-06-21 15:14 ` Rusty Russell
2002-06-21 15:31 ` James Bottomley
2002-06-21 19:17 ` Rusty Russell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox