* [Patch] SMP call function cleanup
@ 2004-04-22 12:21 Jan Glauber
2004-04-22 12:28 ` William Lee Irwin III
` (3 more replies)
0 siblings, 4 replies; 23+ messages in thread
From: Jan Glauber @ 2004-04-22 12:21 UTC (permalink / raw)
To: linux-arch; +Cc: schwidefsky
Hello,
I've been looking into the SMP call function stuff on different
archs and found many different functions...
In the common code part there are 2 functions:
smp_call_function() // call a function on all CPUs but my own
on_each_cpu() // call a function on all CPUs
Many archs need an additional function to call a function on a
specific CPU:
arch-s390:
smp_call_function_on()
arch-alpha:
smp_call_function_on_cpu()
arch-ia64:
smp_call_function_single()
On i386 there is no smp_call_function_single() so they have a workaround
with smp_call_function() and testing for smp_processor_id().
Finally the slab allocator has its own on_each_cpu() function:
mm:
smp_call_function_all_cpus()
This is somewhat inconsistent.
Proposed cleanup:
there are 3 different kinds of SMP calls:
1. all CPUs
2. all CPUs but my own
3. one CPU
only _one_ basic function is needed to implement all variants:
This can be named on_cpu() and takes a cpumask as a parameter;
the function is called on all CPUs set in the mask:
void on_cpu(void (*func) (void *info), void *info, int retry,
int wait, cpumask_t map);
Each architecture should implement this call.
The 3 variants can then be implemented in common code as small inlines:
on_each_cpu -> on_cpu( ... , cpu_online_map)
on_other_cpus() -> on_cpu( ... , cpu_online_map & ~cpumask_of_cpu(smp_processor_id()))
on_single_cpu() -> on_cpu( ... , cpumask_of_cpu(cpu))
Then each architecture needs only one function that implements smp calls.
Besides the consistent naming, these names are also shorter than the originals.
I've built a patch for s390 & i386.
Jan
---
Jan Glauber
Linux on zSeries Development
IBM Deutschland Entwicklung GmbH
Phone: +49 7031 161911 Mail: jang@de.ibm.com
diff -urN linux-2.6.5/arch/i386/kernel/cpu/mtrr/main.c linux-2.6.5_smp/arch/i386/kernel/cpu/mtrr/main.c
--- linux-2.6.5/arch/i386/kernel/cpu/mtrr/main.c 2004-04-04 05:36:16.000000000 +0200
+++ linux-2.6.5_smp/arch/i386/kernel/cpu/mtrr/main.c 2004-04-14 14:31:18.000000000 +0200
@@ -223,9 +223,7 @@
atomic_set(&data.gate,0);
/* Start the ball rolling on other CPUs */
- if (smp_call_function(ipi_handler, &data, 1, 0) != 0)
- panic("mtrr: timed out waiting for other CPUs\n");
-
+ on_other_cpus(ipi_handler, &data, 1, 0);
local_irq_save(flags);
while(atomic_read(&data.count)) {
diff -urN linux-2.6.5/arch/i386/kernel/cpuid.c linux-2.6.5_smp/arch/i386/kernel/cpuid.c
--- linux-2.6.5/arch/i386/kernel/cpuid.c 2004-04-04 05:36:12.000000000 +0200
+++ linux-2.6.5_smp/arch/i386/kernel/cpuid.c 2004-04-14 14:52:12.000000000 +0200
@@ -55,8 +55,7 @@
{
struct cpuid_command *cmd = (struct cpuid_command *) cmd_block;
- if ( cmd->cpu == smp_processor_id() )
- cpuid(cmd->reg, &cmd->data[0], &cmd->data[1], &cmd->data[2], &cmd->data[3]);
+ cpuid(cmd->reg, &cmd->data[0], &cmd->data[1], &cmd->data[2], &cmd->data[3]);
}
static inline void do_cpuid(int cpu, u32 reg, u32 *data)
@@ -71,7 +70,7 @@
cmd.reg = reg;
cmd.data = data;
- smp_call_function(cpuid_smp_cpuid, &cmd, 1, 1);
+ on_single_cpu(cpuid_smp_cpuid, &cmd, 1, 1, cpu);
}
preempt_enable();
}
diff -urN linux-2.6.5/arch/i386/kernel/i386_ksyms.c linux-2.6.5_smp/arch/i386/kernel/i386_ksyms.c
--- linux-2.6.5/arch/i386/kernel/i386_ksyms.c 2004-04-04 05:38:23.000000000 +0200
+++ linux-2.6.5_smp/arch/i386/kernel/i386_ksyms.c 2004-04-14 16:21:47.000000000 +0200
@@ -148,7 +148,6 @@
/* Global SMP stuff */
EXPORT_SYMBOL(synchronize_irq);
-EXPORT_SYMBOL(smp_call_function);
/* TLB flushing */
EXPORT_SYMBOL(flush_tlb_page);
diff -urN linux-2.6.5/arch/i386/kernel/ldt.c linux-2.6.5_smp/arch/i386/kernel/ldt.c
--- linux-2.6.5/arch/i386/kernel/ldt.c 2004-04-04 05:37:37.000000000 +0200
+++ linux-2.6.5_smp/arch/i386/kernel/ldt.c 2004-04-14 14:53:18.000000000 +0200
@@ -61,7 +61,7 @@
load_LDT(pc);
mask = cpumask_of_cpu(smp_processor_id());
if (!cpus_equal(current->mm->cpu_vm_mask, mask))
- smp_call_function(flush_ldt, 0, 1, 1);
+ on_other_cpus(flush_ldt, 0, 1, 1);
preempt_enable();
#else
load_LDT(pc);
diff -urN linux-2.6.5/arch/i386/kernel/msr.c linux-2.6.5_smp/arch/i386/kernel/msr.c
--- linux-2.6.5/arch/i386/kernel/msr.c 2004-04-04 05:36:57.000000000 +0200
+++ linux-2.6.5_smp/arch/i386/kernel/msr.c 2004-04-14 15:15:15.000000000 +0200
@@ -99,16 +99,14 @@
{
struct msr_command *cmd = (struct msr_command *) cmd_block;
- if ( cmd->cpu == smp_processor_id() )
- cmd->err = wrmsr_eio(cmd->reg, cmd->data[0], cmd->data[1]);
+ cmd->err = wrmsr_eio(cmd->reg, cmd->data[0], cmd->data[1]);
}
static void msr_smp_rdmsr(void *cmd_block)
{
struct msr_command *cmd = (struct msr_command *) cmd_block;
- if ( cmd->cpu == smp_processor_id() )
- cmd->err = rdmsr_eio(cmd->reg, &cmd->data[0], &cmd->data[1]);
+ cmd->err = rdmsr_eio(cmd->reg, &cmd->data[0], &cmd->data[1]);
}
static inline int do_wrmsr(int cpu, u32 reg, u32 eax, u32 edx)
@@ -125,7 +123,7 @@
cmd.data[0] = eax;
cmd.data[1] = edx;
- smp_call_function(msr_smp_wrmsr, &cmd, 1, 1);
+ on_single_cpu(msr_smp_wrmsr, &cmd, 1, 1, cpu);
ret = cmd.err;
}
preempt_enable();
@@ -144,7 +142,7 @@
cmd.cpu = cpu;
cmd.reg = reg;
- smp_call_function(msr_smp_rdmsr, &cmd, 1, 1);
+ on_single_cpu(msr_smp_rdmsr, &cmd, 1, 1, cpu);
*eax = cmd.data[0];
*edx = cmd.data[1];
diff -urN linux-2.6.5/arch/i386/kernel/reboot.c linux-2.6.5_smp/arch/i386/kernel/reboot.c
--- linux-2.6.5/arch/i386/kernel/reboot.c 2004-04-04 05:36:54.000000000 +0200
+++ linux-2.6.5_smp/arch/i386/kernel/reboot.c 2004-04-14 15:20:35.000000000 +0200
@@ -239,7 +239,7 @@
cleared reboot_smp, and do the reboot if it is the
correct CPU, otherwise it halts. */
if (reboot_cpu != cpuid)
- smp_call_function((void *)machine_restart , NULL, 1, 0);
+ on_other_cpus((void *)machine_restart , NULL, 1, 0);
}
/* if reboot_cpu is still -1, then we want a tradional reboot,
diff -urN linux-2.6.5/arch/i386/kernel/smp.c linux-2.6.5_smp/arch/i386/kernel/smp.c
--- linux-2.6.5/arch/i386/kernel/smp.c 2004-04-04 05:36:18.000000000 +0200
+++ linux-2.6.5_smp/arch/i386/kernel/smp.c 2004-04-22 13:26:58.000000000 +0200
@@ -478,7 +478,7 @@
}
/*
- * Structure and data for smp_call_function(). This is designed to minimise
+ * Structure and data for on_cpu(). This is designed to minimise
* static memory requirements. It also looks cleaner.
*/
static spinlock_t call_lock = SPIN_LOCK_UNLOCKED;
@@ -486,64 +486,77 @@
struct call_data_struct {
void (*func) (void *info);
void *info;
- atomic_t started;
- atomic_t finished;
+ cpumask_t started;
+ cpumask_t finished;
int wait;
};
static struct call_data_struct * call_data;
/*
- * this function sends a 'generic call function' IPI to all other CPUs
- * in the system.
- */
-
-int smp_call_function (void (*func) (void *info), void *info, int nonatomic,
- int wait)
-/*
- * [SUMMARY] Run a function on all other CPUs.
- * <func> The function to run. This must be fast and non-blocking.
- * <info> An arbitrary pointer to pass to the function.
- * <nonatomic> currently unused.
- * <wait> If true, wait (atomically) until function has completed on other CPUs.
- * [RETURNS] 0 on success, else a negative status code. Does not return until
- * remote CPUs are nearly ready to execute <<func>> or are or have executed.
+ * [Summary] Run a function on all specified CPUs.
+ * <func> The function to run. This must be fast and non-blocking.
+ * <info> An arbitrary pointer to pass to the function.
+ * <nonatomic> currently unused.
+ * <wait> If true, wait atomically until function has completed on
+ * other CPUs.
+ * <map> All CPUs where the function should run.
*
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
+ * Does not return until remote CPUs are nearly ready to execute <func>
+ * or are or have executed.
+ *
+ * You must not call this function with disabled interrupts or from a hardware
+ * interrupt handler. You must call this function with preemption disabled.
*/
+void on_cpu(void (*func) (void *info), void *info, int nonatomic,
+ int wait, cpumask_t map)
{
struct call_data_struct data;
- int cpus = num_online_cpus()-1;
+ int local = 0;
+
+ /*
+ * Check for local function call.
+ * The local call comes after the remote call,
+ * otherwise machine_restart_smp() doesn't work.
+ */
+ if (cpu_isset(smp_processor_id(), map)) {
+ local = 1;
+ cpu_clear(smp_processor_id(), map);
+ }
- if (!cpus)
- return 0;
+ cpus_and(map, map, cpu_online_map);
+
+ if (cpus_empty(map))
+ goto out;
data.func = func;
- data.info = info;
- atomic_set(&data.started, 0);
+ data.info = info;
+ cpus_clear(data.started);
data.wait = wait;
if (wait)
- atomic_set(&data.finished, 0);
+ cpus_clear(data.finished);
- spin_lock(&call_lock);
+ spin_lock_bh(&call_lock);
call_data = &data;
mb();
-
- /* Send a message to all other CPUs and wait for them to respond */
- send_IPI_allbutself(CALL_FUNCTION_VECTOR);
+
+ /* call the cross CPU functions */
+ send_IPI_mask(map, CALL_FUNCTION_VECTOR);
/* Wait for response */
- while (atomic_read(&data.started) != cpus)
+ while (!cpus_equal(map, data.started))
barrier();
if (wait)
- while (atomic_read(&data.finished) != cpus)
+ while (!cpus_equal(map, data.finished))
barrier();
- spin_unlock(&call_lock);
- return 0;
+ spin_unlock_bh(&call_lock);
+ out:
+ if (local)
+ func(info);
}
+EXPORT_SYMBOL(on_cpu);
static void stop_this_cpu (void * dummy)
{
@@ -564,7 +577,7 @@
void smp_send_stop(void)
{
- smp_call_function(stop_this_cpu, NULL, 1, 0);
+ on_other_cpus(stop_this_cpu, NULL, 1, 0);
local_irq_disable();
disable_local_APIC();
@@ -593,7 +606,7 @@
* about to execute the function
*/
mb();
- atomic_inc(&call_data->started);
+ cpu_set(smp_processor_id(), call_data->started);
/*
* At this point the info structure may be out of scope unless wait==1
*/
@@ -603,7 +616,7 @@
if (wait) {
mb();
- atomic_inc(&call_data->finished);
+ cpu_set(smp_processor_id(), call_data->finished);
}
}
diff -urN linux-2.6.5/arch/s390/appldata/appldata_base.c linux-2.6.5_smp/arch/s390/appldata/appldata_base.c
--- linux-2.6.5/arch/s390/appldata/appldata_base.c 2004-04-04 05:36:48.000000000 +0200
+++ linux-2.6.5_smp/arch/s390/appldata/appldata_base.c 2004-04-14 15:41:24.000000000 +0200
@@ -189,7 +189,7 @@
/*
* appldata_mod_vtimer_wrap()
*
- * wrapper function for mod_virt_timer(), because smp_call_function_on()
+ * wrapper function for mod_virt_timer(), because on_single_cpu()
* accepts only one parameter.
*/
static void appldata_mod_vtimer_wrap(struct appldata_mod_vtimer_args *args) {
@@ -281,9 +281,8 @@
if ((buf[0] == '1') && (!appldata_timer_active)) {
for (i = 0; i < num_online_cpus(); i++) {
per_cpu(appldata_timer, i).expires = per_cpu_interval;
- smp_call_function_on(add_virt_timer_periodic,
- &per_cpu(appldata_timer, i),
- 0, 1, i);
+ on_single_cpu(add_virt_timer_periodic,
+ &per_cpu(appldata_timer, i), 0, 1, i);
}
appldata_timer_active = 1;
P_INFO("Monitoring timer started.\n");
@@ -346,10 +345,8 @@
&per_cpu(appldata_timer, i);
appldata_mod_vtimer_args.expires =
per_cpu_interval;
- smp_call_function_on(
- (void *) appldata_mod_vtimer_wrap,
- &appldata_mod_vtimer_args,
- 0, 1, i);
+ on_single_cpu((void *) appldata_mod_vtimer_wrap,
+ &appldata_mod_vtimer_args, 0, 1, i);
}
}
spin_unlock(&appldata_timer_lock);
diff -urN linux-2.6.5/arch/s390/kernel/smp.c linux-2.6.5_smp/arch/s390/kernel/smp.c
--- linux-2.6.5/arch/s390/kernel/smp.c 2004-04-04 05:36:13.000000000 +0200
+++ linux-2.6.5_smp/arch/s390/kernel/smp.c 2004-04-22 13:25:44.000000000 +0200
@@ -66,11 +66,10 @@
extern void do_reipl(unsigned long devno);
-static void smp_ext_bitcall(int, ec_bit_sig);
-static void smp_ext_bitcall_others(ec_bit_sig);
+static inline void smp_ext_bitcall(int, ec_bit_sig);
/*
- * Structure and data for smp_call_function(). This is designed to minimise
+ * Structure and data for on_cpu(). This is designed to minimise
* static memory requirements. It also looks cleaner.
*/
static spinlock_t call_lock = SPIN_LOCK_UNLOCKED;
@@ -78,8 +77,8 @@
struct call_data_struct {
void (*func) (void *info);
void *info;
- atomic_t started;
- atomic_t finished;
+ cpumask_t started;
+ cpumask_t finished;
int wait;
};
@@ -94,116 +93,78 @@
void *info = call_data->info;
int wait = call_data->wait;
- atomic_inc(&call_data->started);
+ cpu_set(smp_processor_id(), call_data->started);
(*func)(info);
if (wait)
- atomic_inc(&call_data->finished);
+ cpu_set(smp_processor_id(), call_data->finished);
}
/*
- * this function sends a 'generic call function' IPI to all other CPUs
- * in the system.
- */
-
-int smp_call_function (void (*func) (void *info), void *info, int nonatomic,
- int wait)
-/*
- * [SUMMARY] Run a function on all other CPUs.
- * <func> The function to run. This must be fast and non-blocking.
- * <info> An arbitrary pointer to pass to the function.
- * <nonatomic> currently unused.
- * <wait> If true, wait (atomically) until function has completed on other CPUs.
- * [RETURNS] 0 on success, else a negative status code. Does not return until
- * remote CPUs are nearly ready to execute <<func>> or are or have executed.
+ * [Summary] Run a function on all specified CPUs.
+ * <func> The function to run. This must be fast and non-blocking.
+ * <info> An arbitrary pointer to pass to the function.
+ * <nonatomic> currently unused.
+ * <wait> If true, wait atomically until function has completed on
+ * other CPUs.
+ * <map> All CPUs where the function should run.
*
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
- */
-{
- struct call_data_struct data;
- int cpus = num_online_cpus()-1;
-
- /* FIXME: get cpu lock -hc */
- if (cpus <= 0)
- return 0;
-
- data.func = func;
- data.info = info;
- atomic_set(&data.started, 0);
- data.wait = wait;
- if (wait)
- atomic_set(&data.finished, 0);
-
- spin_lock(&call_lock);
- call_data = &data;
- /* Send a message to all other CPUs and wait for them to respond */
- smp_ext_bitcall_others(ec_call_function);
-
- /* Wait for response */
- while (atomic_read(&data.started) != cpus)
- cpu_relax();
-
- if (wait)
- while (atomic_read(&data.finished) != cpus)
- cpu_relax();
- spin_unlock(&call_lock);
-
- return 0;
-}
-
-/*
- * Call a function on one CPU
- * cpu : the CPU the function should be executed on
- *
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler. You may call it from a bottom half.
+ * Does not return until remote CPUs are nearly ready to execute <func>
+ * or are or have executed.
*
- * It is guaranteed that the called function runs on the specified CPU,
- * preemption is disabled.
+ * You must not call this function with disabled interrupts or from a hardware
+ * interrupt handler. You must call this function with preemption disabled.
+ * XXX You may call it from a bottom half handler.
*/
-int smp_call_function_on(void (*func) (void *info), void *info,
- int nonatomic, int wait, int cpu)
+void on_cpu(void (*func) (void *info), void *info, int nonatomic,
+ int wait, cpumask_t map)
{
struct call_data_struct data;
- int curr_cpu;
+ int cpu, local = 0;
- if (!cpu_online(cpu))
- return -EINVAL;
+ /*
+ * Check for local function call.
+ * In on_each_cpu() the local call comes after the remote call,
+ * we have to call it in the same order, else machine_restart_smp()
+ * doesn't work.
+ */
+ if (cpu_isset(smp_processor_id(), map)) {
+ local = 1;
+ cpu_clear(smp_processor_id(), map);
+ }
- /* disable preemption for local function call */
- curr_cpu = get_cpu();
+ cpus_and(map, map, cpu_online_map);
- if (curr_cpu == cpu) {
- /* direct call to function */
- func(info);
- put_cpu();
- return 0;
- }
+ if (cpus_empty(map))
+ goto out;
data.func = func;
data.info = info;
- atomic_set(&data.started, 0);
+ cpus_clear(data.started);
data.wait = wait;
if (wait)
- atomic_set(&data.finished, 0);
+ cpus_clear(data.finished);
spin_lock_bh(&call_lock);
call_data = &data;
- smp_ext_bitcall(cpu, ec_call_function);
+
+ /* call the cross CPU functions */
+ for_each_cpu_mask(cpu, map)
+ smp_ext_bitcall(cpu, ec_call_function);
/* Wait for response */
- while (atomic_read(&data.started) != 1)
+ while (!cpus_equal(map, data.started))
cpu_relax();
if (wait)
- while (atomic_read(&data.finished) != 1)
+ while (!cpus_equal(map, data.finished))
cpu_relax();
spin_unlock_bh(&call_lock);
- put_cpu();
- return 0;
+ out:
+ if (local)
+ func(info);
}
-EXPORT_SYMBOL(smp_call_function_on);
+EXPORT_SYMBOL(on_cpu);
static inline void do_send_stop(void)
{
@@ -357,10 +318,9 @@
}
/*
- * Send an external call sigp to another cpu and return without waiting
- * for its completion.
+ * Send an external call sigp to another cpu and wait for its completion.
*/
-static void smp_ext_bitcall(int cpu, ec_bit_sig sig)
+static inline void smp_ext_bitcall(int cpu, ec_bit_sig sig)
{
/*
* Set signaling bit in lowcore of target cpu and kick it
@@ -370,26 +330,6 @@
udelay(10);
}
-/*
- * Send an external call sigp to every other cpu in the system and
- * return without waiting for its completion.
- */
-static void smp_ext_bitcall_others(ec_bit_sig sig)
-{
- int i;
-
- for (i = 0; i < NR_CPUS; i++) {
- if (!cpu_online(i) || smp_processor_id() == i)
- continue;
- /*
- * Set signaling bit in lowcore of target cpu and kick it
- */
- set_bit(sig, (unsigned long *) &lowcore_ptr[i]->ext_call_fast);
- while (signal_processor(i, sigp_external_call) == sigp_busy)
- udelay(10);
- }
-}
-
#ifndef CONFIG_ARCH_S390X
/*
* this function sends a 'purge tlb' signal to another CPU.
@@ -453,7 +393,7 @@
parms.orvals[cr] = 1 << bit;
parms.andvals[cr] = -1L;
preempt_disable();
- smp_call_function(smp_ctl_bit_callback, &parms, 0, 1);
+ on_other_cpus(smp_ctl_bit_callback, &parms, 0, 1);
__ctl_set_bit(cr, bit);
preempt_enable();
}
@@ -469,7 +409,7 @@
parms.orvals[cr] = 0;
parms.andvals[cr] = ~(1L << bit);
preempt_disable();
- smp_call_function(smp_ctl_bit_callback, &parms, 0, 1);
+ on_other_cpus(smp_ctl_bit_callback, &parms, 0, 1);
__ctl_clear_bit(cr, bit);
preempt_enable();
}
@@ -652,4 +592,4 @@
EXPORT_SYMBOL(lowcore_ptr);
EXPORT_SYMBOL(smp_ctl_set_bit);
EXPORT_SYMBOL(smp_ctl_clear_bit);
-EXPORT_SYMBOL(smp_call_function);
+
diff -urN linux-2.6.5/drivers/s390/net/iucv.c linux-2.6.5_smp/drivers/s390/net/iucv.c
--- linux-2.6.5/drivers/s390/net/iucv.c 2004-04-04 05:36:18.000000000 +0200
+++ linux-2.6.5_smp/drivers/s390/net/iucv.c 2004-04-14 16:11:17.000000000 +0200
@@ -670,12 +670,7 @@
ulong b2f0_result = 0x0deadbeef;
iucv_debug(1, "entering");
- preempt_disable();
- if (smp_processor_id() == 0)
- iucv_declare_buffer_cpu0(&b2f0_result);
- else
- smp_call_function(iucv_declare_buffer_cpu0, &b2f0_result, 0, 1);
- preempt_enable();
+ on_single_cpu(iucv_declare_buffer_cpu0, &b2f0_result, 0, 1, 0);
iucv_debug(1, "Address of EIB = %p", iucv_external_int_buffer);
if (b2f0_result == 0x0deadbeef)
b2f0_result = 0xaa;
@@ -694,13 +689,8 @@
{
iucv_debug(1, "entering");
if (declare_flag) {
- preempt_disable();
- if (smp_processor_id() == 0)
- iucv_retrieve_buffer_cpu0(0);
- else
- smp_call_function(iucv_retrieve_buffer_cpu0, 0, 0, 1);
+ on_single_cpu(iucv_retrieve_buffer_cpu0, 0, 0, 1, 0);
declare_flag = 0;
- preempt_enable();
}
iucv_debug(1, "exiting");
return 0;
@@ -2220,13 +2210,7 @@
} u;
u.param = SetMaskFlag;
- preempt_disable();
- if (smp_processor_id() == 0)
- iucv_setmask_cpu0(&u);
- else
- smp_call_function(iucv_setmask_cpu0, &u, 0, 1);
- preempt_enable();
-
+ on_single_cpu(iucv_setmask_cpu0, &u, 0, 1, 0);
return u.result;
}
diff -urN linux-2.6.5/include/linux/smp.h linux-2.6.5_smp/include/linux/smp.h
--- linux-2.6.5/include/linux/smp.h 2004-04-04 05:37:23.000000000 +0200
+++ linux-2.6.5_smp/include/linux/smp.h 2004-04-14 13:54:35.000000000 +0200
@@ -49,10 +49,40 @@
extern void smp_cpus_done(unsigned int max_cpus);
/*
- * Call a function on all other processors
+ * Call a function on the specified processors
*/
-extern int smp_call_function (void (*func) (void *info), void *info,
- int retry, int wait);
+extern void on_cpu(void (*func) (void *info), void *info, int retry,
+ int wait, cpumask_t map);
+
+/*
+ * Call a function on one processor.
+ */
+static inline void on_single_cpu(void (*func) (void *info), void *info,
+ int nonatomic, int wait, int cpu)
+{
+ cpumask_t map;
+
+ preempt_disable();
+ cpus_clear(map);
+ cpu_set(cpu, map);
+ on_cpu(func, info, nonatomic, wait, map);
+ preempt_enable();
+}
+
+/*
+ * Call a function on all other processors.
+ */
+static inline void on_other_cpus(void (*func) (void *info), void *info,
+ int nonatomic, int wait)
+{
+ cpumask_t map;
+
+ preempt_disable();
+ map = cpu_online_map;
+ cpu_clear(smp_processor_id(), map);
+ on_cpu(func, info, nonatomic, wait, map);
+ preempt_enable();
+}
/*
* Call a function on all processors
@@ -60,13 +90,10 @@
static inline int on_each_cpu(void (*func) (void *info), void *info,
int retry, int wait)
{
- int ret = 0;
-
preempt_disable();
- ret = smp_call_function(func, info, retry, wait);
- func(info);
+ on_cpu(func, info, retry, wait, cpu_online_map);
preempt_enable();
- return ret;
+ return 0;
}
/*
@@ -99,8 +126,10 @@
#define smp_processor_id() 0
#define hard_smp_processor_id() 0
#define smp_threads_ready 1
-#define smp_call_function(func,info,retry,wait) ({ 0; })
-#define on_each_cpu(func,info,retry,wait) ({ func(info); 0; })
+#define on_cpu(func,info,retry,wait,map) ({ func(info); 0; })
+#define on_single_cpu(func,info,retry,wait,cpu) ({ func(info); 0; })
+#define on_other_cpus(func,info,retry,wait) ({ 0; })
+#define on_each_cpu(func,info,retry,wait) ({ func(info); 0; })
static inline void smp_send_reschedule(int cpu) { }
#define num_booting_cpus() 1
#define smp_prepare_boot_cpu() do {} while (0)
diff -urN linux-2.6.5/mm/slab.c linux-2.6.5_smp/mm/slab.c
--- linux-2.6.5/mm/slab.c 2004-04-04 05:37:41.000000000 +0200
+++ linux-2.6.5_smp/mm/slab.c 2004-04-14 14:03:39.000000000 +0200
@@ -51,7 +51,7 @@
* On SMP, it additionally reduces the spinlock operations.
*
* The c_cpuarray may not be read with enabled local interrupts -
- * it's changed with a smp_call_function().
+ * it's changed with a on_cpu().
*
* SMP synchronization:
* constructors and destructors are called without any locking.
@@ -1375,6 +1375,7 @@
*/
static void smp_call_function_all_cpus(void (*func) (void *arg), void *arg)
{
+ // XXX if order doesn't matter use on_each_cpu()
check_irq_on();
preempt_disable();
@@ -1382,9 +1383,7 @@
func(arg);
local_irq_enable();
- if (smp_call_function(func, arg, 1, 1))
- BUG();
-
+ on_other_cpus(func, arg, 1, 1);
preempt_enable();
}
diff -urN linux-2.6.5/net/core/flow.c linux-2.6.5_smp/net/core/flow.c
--- linux-2.6.5/net/core/flow.c 2004-04-04 05:36:15.000000000 +0200
+++ linux-2.6.5_smp/net/core/flow.c 2004-04-14 14:06:19.000000000 +0200
@@ -292,7 +292,7 @@
init_completion(&info.completion);
local_bh_disable();
- smp_call_function(flow_cache_flush_per_cpu, &info, 1, 0);
+ on_other_cpus(flow_cache_flush_per_cpu, &info, 1, 0);
flow_cache_flush_tasklet((unsigned long)&info);
local_bh_enable();
* Re: [Patch] SMP call function cleanup
2004-04-22 12:21 [Patch] SMP call function cleanup Jan Glauber
@ 2004-04-22 12:28 ` William Lee Irwin III
2004-04-22 12:37 ` Anton Blanchard
2004-04-22 13:58 ` Jan Glauber
2004-04-22 12:33 ` Anton Blanchard
` (2 subsequent siblings)
3 siblings, 2 replies; 23+ messages in thread
From: William Lee Irwin III @ 2004-04-22 12:28 UTC (permalink / raw)
To: Jan Glauber; +Cc: linux-arch, schwidefsky
On Thu, Apr 22, 2004 at 02:21:51PM +0200, Jan Glauber wrote:
> I've been looking into the SMP call function stuff on different
> archs and found many different functions...
> In the common code part there are 2 functions:
> smp_call_function() // call a function on all CPUs but my own
> on_each_cpu() // call a function on all CPUs
> Many archs need an additional function to call a function on a
> specific CPU:
> arch-s390:
> smp_call_function_on()
> arch-alpha:
> smp_call_function_on_cpu()
> arch-ia64:
> smp_call_function_single()
> On i386 there is no smp_call_function_single() so they have a workaround
> with smp_call_function() and testing for smp_processor_id().
> Finally the slab allocator has its own on_each_cpu() function:
Any chance we could get an smp_call_function_cpumask()?
-- wli
* Re: [Patch] SMP call function cleanup
2004-04-22 12:28 ` William Lee Irwin III
@ 2004-04-22 12:37 ` Anton Blanchard
2004-04-22 12:49 ` William Lee Irwin III
` (2 more replies)
2004-04-22 13:58 ` Jan Glauber
1 sibling, 3 replies; 23+ messages in thread
From: Anton Blanchard @ 2004-04-22 12:37 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Jan Glauber, linux-arch, schwidefsky
Hi,
> Any chance we could get an smp_call_function_cpumask()?
Unless there is a need I'd prefer not to. Some arches do not support IPI
to a cpumask, whereas they should do send to all and send to single.
Anton
* Re: [Patch] SMP call function cleanup
2004-04-22 12:37 ` Anton Blanchard
@ 2004-04-22 12:49 ` William Lee Irwin III
2004-04-22 12:59 ` James Bottomley
2004-04-23 0:05 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 23+ messages in thread
From: William Lee Irwin III @ 2004-04-22 12:49 UTC (permalink / raw)
To: Anton Blanchard; +Cc: Jan Glauber, linux-arch, schwidefsky
At some point in the past, I wrote:
>> Any chance we could get an smp_call_function_cpumask()?
On Thu, Apr 22, 2004 at 10:37:03PM +1000, Anton Blanchard wrote:
> Unless there is a need I'd prefer not to. Some arches do not support IPI
> to a cpumask, whereas they should do send to all and send to single.
I remember someone making noise about a need and don't remember the
context. I'll just let them take it up when the patches go out. I
actually presumed it wouldn't be natively supported, but rather be a
helper function for more general usage cases, esp. ones where > 1 cpu
is needed, but not all, and hitting every cpu on the system is greatly
more expensive than hitting only the desired cpus, which is about all
I remember of the noise that was made.
-- wli
* Re: [Patch] SMP call function cleanup
2004-04-22 12:37 ` Anton Blanchard
2004-04-22 12:49 ` William Lee Irwin III
@ 2004-04-22 12:59 ` James Bottomley
2004-04-22 13:24 ` William Lee Irwin III
[not found] ` <1082641822.1329.45.camel@halo>
2004-04-23 0:05 ` Benjamin Herrenschmidt
2 siblings, 2 replies; 23+ messages in thread
From: James Bottomley @ 2004-04-22 12:59 UTC (permalink / raw)
To: Anton Blanchard
Cc: William Lee Irwin III, Jan Glauber, linux-arch, schwidefsky
On Thu, 2004-04-22 at 08:37, Anton Blanchard wrote:
> Unless there is a need I'd prefer not to. Some arches do not support IPI
> to a cpumask, whereas they should do send to all and send to single.
This is my preference too. Voyager is one such arch.
If it's added, there certainly has to be some way of ensuring that the
cpumask variant isn't used preferentially to execute on a single cpu.
James
* Re: [Patch] SMP call function cleanup
2004-04-22 12:59 ` James Bottomley
@ 2004-04-22 13:24 ` William Lee Irwin III
2004-04-22 13:46 ` James Bottomley
[not found] ` <1082641822.1329.45.camel@halo>
1 sibling, 1 reply; 23+ messages in thread
From: William Lee Irwin III @ 2004-04-22 13:24 UTC (permalink / raw)
To: James Bottomley; +Cc: Anton Blanchard, Jan Glauber, linux-arch, schwidefsky
On Thu, 2004-04-22 at 08:37, Anton Blanchard wrote:
>> Unless there is a need I'd prefer not to. Some arches do not support IPI
>> to a cpumask, whereas they should do send to all and send to single.
On Thu, Apr 22, 2004 at 08:59:39AM -0400, James Bottomley wrote:
> This is my preference too. Voyager is one such arch.
> If it's added, there certainly has to be some way of ensuring that the
> cpumask variant isn't used preferentially to execute on a single cpu.
Well, at the moment, anyone in need of IPI'ing > 1 cpu is IPI'ing all,
which is not swift, so bear that in mind. Though given on_one_cpu(), I
suppose they can do:
for_each_cpu_mask(cpu, foo->mask)
on_one_cpu(cpu, bar, ...);
which more or less avoids IPI'ing 1024 cpus to run a function on 2 or
whatever they were going on about, so they can likely code it that way.
-- wli
* Re: [Patch] SMP call function cleanup
2004-04-22 13:24 ` William Lee Irwin III
@ 2004-04-22 13:46 ` James Bottomley
2004-04-22 14:06 ` William Lee Irwin III
0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2004-04-22 13:46 UTC (permalink / raw)
To: William Lee Irwin III
Cc: Anton Blanchard, Jan Glauber, linux-arch, schwidefsky
On Thu, 2004-04-22 at 09:24, William Lee Irwin III wrote:
> Well, at the moment, anyone in need of IPI'ing > 1 cpu is IPI'ing all,
> which is not swift, so bear that in mind. Though given on_one_cpu(), I
> suppose they can do:
>
> for_each_cpu_mask(cpu, foo->mask)
> on_one_cpu(cpu, bar, ...);
>
> which more or less avoids IPI'ing 1024 cpus to run a function on 2 or
> whatever they were going on about, so they can likely code it that way.
But the key is 'anyone in need of'. What I'd like is for you to
demonstrate a need of execute on cpumask before it gets added to the
API. Murphy's law says that when given a choice people invariably make
the wrong one, so let's not introduce choice into the api unless it's
absolutely necessary.
James
* Re: [Patch] SMP call function cleanup
2004-04-22 13:46 ` James Bottomley
@ 2004-04-22 14:06 ` William Lee Irwin III
0 siblings, 0 replies; 23+ messages in thread
From: William Lee Irwin III @ 2004-04-22 14:06 UTC (permalink / raw)
To: James Bottomley; +Cc: Anton Blanchard, Jan Glauber, linux-arch, schwidefsky
On Thu, 2004-04-22 at 09:24, William Lee Irwin III wrote:
>> for_each_cpu_mask(cpu, foo->mask)
>> on_one_cpu(cpu, bar, ...);
On Thu, Apr 22, 2004 at 09:46:16AM -0400, James Bottomley wrote:
> But the key is 'anyone in need of'. What I'd like is for you to
> demonstrate a need of execute on cpumask before it gets added to the
> API. Murphy's law says that when given a choice people invariably make
> the wrong one, so let's not introduce choice into the api unless it's
> absolutely necessary.
I remembered what cw told me. It was for ia64 repairing the suboptimal
situation where flush_tlb_mm() does the wrong thing and flushes on all
cpus regardless of mm->cpu_vm_mask. So we can drop this rather easily,
as even when it's resolved, that will be pure arch code that will have
the native IPI API's visible to it.
zwane wanted it for some 100% lockless timer code, but I think that's
dead or not getting merged in the near future, so the same applies.
-- wli
[parent not found: <1082641822.1329.45.camel@halo>]
* Re: [Patch] SMP call function cleanup
2004-04-22 12:37 ` Anton Blanchard
2004-04-22 12:49 ` William Lee Irwin III
2004-04-22 12:59 ` James Bottomley
@ 2004-04-23 0:05 ` Benjamin Herrenschmidt
2004-04-23 0:21 ` Anton Blanchard
2 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2004-04-23 0:05 UTC (permalink / raw)
To: Anton Blanchard
Cc: William Lee Irwin III, Jan Glauber, Linux Arch list, schwidefsky
On Thu, 2004-04-22 at 22:37, Anton Blanchard wrote:
> Hi,
>
> > Any chance we could get an smp_call_function_cpumask()?
>
> Unless there is a need I'd prefer not to. Some arches do not support IPI
> to a cpumask, whereas they should do send to all and send to single.
It's easy for those archs to implement it with a send-all and software
filtering. For the cases where it would be an optimisation to send to
a cpumask, it makes sense to provide that function.
Ben.
* Re: [Patch] SMP call function cleanup
2004-04-23 0:05 ` Benjamin Herrenschmidt
@ 2004-04-23 0:21 ` Anton Blanchard
2004-04-23 0:32 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 23+ messages in thread
From: Anton Blanchard @ 2004-04-23 0:21 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: William Lee Irwin III, Jan Glauber, Linux Arch list, schwidefsky
> It's easy for those archs to implement it with a send-all and a software
> filtering. For the cases where it would be an optimisation to send to
> a cpumask, it makes sense to provide that function.
Then someone uses it in a performance-critical place and my big SMP
ppc64 performance sucks.
We still haven't been given a place in generic code where this
optimisation makes sense.
Anton
* Re: [Patch] SMP call function cleanup
2004-04-23 0:21 ` Anton Blanchard
@ 2004-04-23 0:32 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2004-04-23 0:32 UTC (permalink / raw)
To: Anton Blanchard
Cc: William Lee Irwin III, Jan Glauber, Linux Arch list, schwidefsky
On Fri, 2004-04-23 at 10:21, Anton Blanchard wrote:
> > It's easy for those archs to implement it with a send-all and software
> > filtering. For the cases where sending to a cpumask would be an
> > optimisation, it makes sense to provide that function.
>
> Then someone uses it in a performance-critical place and my big SMP
> ppc64 performance sucks.
Ah? OpenPIC can nicely IPI to CPU masks IIRC; can't xics? :)
> We still haven't been given a place in generic code where this
> optimisation makes sense.
I had a few cases where I wanted it until I found different ways to
do things (mostly using RCU), like the PTE freeing race. So far I have
avoided it, but I regularly come up with ideas that would
involve such a thing.
Ben.
* Re: [Patch] SMP call function cleanup
2004-04-22 12:28 ` William Lee Irwin III
2004-04-22 12:37 ` Anton Blanchard
@ 2004-04-22 13:58 ` Jan Glauber
1 sibling, 0 replies; 23+ messages in thread
From: Jan Glauber @ 2004-04-22 13:58 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Linux Architecture List
On Thu, 2004-04-22 at 14:28, William Lee Irwin III wrote:
> Any chance we could get an smp_call_function_cpumask()?
That's what I proposed with on_cpu(). I don't care what we name this
function, but I found the smp_call_function_on_whatever
names a bit too long.
Jan
---
Jan Glauber
Linux on zSeries Development
IBM Deutschland Entwicklung GmbH
Phone: +49 7031 161911 Mail: jang@de.ibm.com
* Re: [Patch] SMP call function cleanup
2004-04-22 12:21 [Patch] SMP call function cleanup Jan Glauber
2004-04-22 12:28 ` William Lee Irwin III
@ 2004-04-22 12:33 ` Anton Blanchard
2004-04-22 14:00 ` Jan Glauber
2004-04-23 0:04 ` Benjamin Herrenschmidt
2004-04-23 23:38 ` David S. Miller
3 siblings, 1 reply; 23+ messages in thread
From: Anton Blanchard @ 2004-04-22 12:33 UTC (permalink / raw)
To: Jan Glauber; +Cc: linux-arch, schwidefsky
Hi,
> I've been looking into the SMP call function stuff on different
> archs and found many different functions...
>
> In the common code part there are 2 functions:
> smp_call_function() // call a function on all CPUs but my own
> on_each_cpu() // call a function on all CPUs
>
> Many archs need an additional function to call a function on a
> specific CPU:
We noticed this too. Rusty created on_one_cpu (below) and had a generic
(but costly) implementation so architectures could switch over when they
wanted.
Anton
--
Name: on_one_cpu function
Author: Rusty Russell
Status: Booted on 2.6.5
Similar to on_each_cpu, this implements on_one_cpu. For archs which
don't do it natively, it's implemented in terms of smp_call_function().
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .21074-linux-2.6.5/include/linux/smp.h .21074-linux-2.6.5.updated/include/linux/smp.h
--- .21074-linux-2.6.5/include/linux/smp.h 2004-03-12 07:57:26.000000000 +1100
+++ .21074-linux-2.6.5.updated/include/linux/smp.h 2004-04-05 16:21:20.000000000 +1000
@@ -8,6 +8,10 @@
#include <linux/config.h>
+#define get_cpu() ({ preempt_disable(); smp_processor_id(); })
+#define put_cpu() preempt_enable()
+#define put_cpu_no_resched() preempt_enable_no_resched()
+
#ifdef CONFIG_SMP
#include <linux/preempt.h>
@@ -69,6 +73,24 @@ static inline int on_each_cpu(void (*fun
return ret;
}
+#ifndef __HAVE_ARCH_ON_ONE_CPU
+extern int __on_one_cpu(unsigned cpu, void (*func)(void *info), void *info);
+static inline int on_one_cpu(unsigned cpu,
+ void (*func)(void *info), void *info)
+{
+ int ret;
+
+ if (cpu == get_cpu()) {
+ func(info);
+ ret = 0;
+ } else
+ ret = __on_one_cpu(cpu, func, info);
+ put_cpu();
+ return ret;
+}
+#endif
+
+
/*
* True once the per process idle is forked
*/
@@ -107,8 +129,4 @@ static inline void smp_send_reschedule(i
#endif /* !SMP */
-#define get_cpu() ({ preempt_disable(); smp_processor_id(); })
-#define put_cpu() preempt_enable()
-#define put_cpu_no_resched() preempt_enable_no_resched()
-
#endif /* __LINUX_SMP_H */
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .21074-linux-2.6.5/kernel/sched.c .21074-linux-2.6.5.updated/kernel/sched.c
--- .21074-linux-2.6.5/kernel/sched.c 2004-04-05 09:04:48.000000000 +1000
+++ .21074-linux-2.6.5.updated/kernel/sched.c 2004-04-05 16:21:45.000000000 +1000
@@ -631,7 +631,29 @@ void kick_process(task_t *p)
EXPORT_SYMBOL_GPL(kick_process);
-#endif
+#ifndef __HAVE_ARCH_ON_ONE_CPU
+struct which_cpu
+{
+ unsigned int cpu;
+ void (*func)(void *info);
+ void *info;
+};
+
+static void maybe_on_cpu(void *_which)
+{
+ struct which_cpu *which = _which;
+
+ if (smp_processor_id() == which->cpu)
+ which->func(which->info);
+}
+
+int __on_one_cpu(unsigned cpu, void (*func)(void *info), void *info)
+{
+ struct which_cpu which = { cpu, func, info };
+ return smp_call_function(maybe_on_cpu, &which, 1, 1);
+}
+#endif /* __HAVE_ARCH_ON_ONE_CPU */
+#endif /* CONFIG_SMP */
/***
* try_to_wake_up - wake up a thread
* Re: [Patch] SMP call function cleanup
2004-04-22 12:33 ` Anton Blanchard
@ 2004-04-22 14:00 ` Jan Glauber
2004-04-22 14:13 ` William Lee Irwin III
0 siblings, 1 reply; 23+ messages in thread
From: Jan Glauber @ 2004-04-22 14:00 UTC (permalink / raw)
To: Anton Blanchard; +Cc: Linux Architecture List
On Thu, 2004-04-22 at 14:33, Anton Blanchard wrote:
> Hi,
>
> > I've been looking into the SMP call function stuff on different
> > archs and found many different functions...
> >
> > In the common code part there are 2 functions:
> > smp_call_function() // call a function on all CPUs but my own
> > on_each_cpu() // call a function on all CPUs
> >
> > Many archs need an additional function to call a function on a
> > specific CPU:
>
> We noticed this too. Rusty created on_one_cpu (below) and had a generic
> (but costly) implementation so architectures could switch over when they
> wanted.
Hm, why can't you just do a:
for_each_cpu_mask(cpu, mask)
send_IPI_single(cpu)
Jan
---
Jan Glauber
Linux on zSeries Development
IBM Deutschland Entwicklung GmbH
Phone: +49 7031 161911 Mail: jang@de.ibm.com
* Re: [Patch] SMP call function cleanup
2004-04-22 14:00 ` Jan Glauber
@ 2004-04-22 14:13 ` William Lee Irwin III
0 siblings, 0 replies; 23+ messages in thread
From: William Lee Irwin III @ 2004-04-22 14:13 UTC (permalink / raw)
To: Jan Glauber; +Cc: Anton Blanchard, Linux Architecture List
On Thu, 2004-04-22 at 14:33, Anton Blanchard wrote:
>> We noticed this too. Rusty created on_one_cpu (below) and had a generic
>> (but costly) implementation so architectures could switch over when they
>> wanted.
On Thu, Apr 22, 2004 at 04:00:08PM +0200, Jan Glauber wrote:
> Hm, why can't you just do a:
> for_each_cpu_mask(cpu, mask)
> send_IPI_single(cpu)
I think they're concerned about API minimality. I just debunked (or
whatever) both of the feature requests I had for it on my end after
double-checking on IRC, so there isn't a known immediate need for it.
-- wli
* Re: [Patch] SMP call function cleanup
2004-04-22 12:21 [Patch] SMP call function cleanup Jan Glauber
2004-04-22 12:28 ` William Lee Irwin III
2004-04-22 12:33 ` Anton Blanchard
@ 2004-04-23 0:04 ` Benjamin Herrenschmidt
2004-04-23 23:38 ` David S. Miller
3 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2004-04-23 0:04 UTC (permalink / raw)
To: glauber; +Cc: Linux Arch list, schwidefsky
> there are 3 different kinds of SMP calls:
> 1. all CPUs
> 2. all CPUs but my own
> 3. one CPU
>
> only _one_ basic function is needed to implement all variants:
Well, I'd go further and allow an arbitrary cpu_mask. There are
cases where being able to send an IPI to a "set" of CPUs could be
an interesting optimisation. If the HW can't do it easily,
just send to all and software-mask who gets the actual function
call.
Ben.
* Re: [Patch] SMP call function cleanup
2004-04-22 12:21 [Patch] SMP call function cleanup Jan Glauber
` (2 preceding siblings ...)
2004-04-23 0:04 ` Benjamin Herrenschmidt
@ 2004-04-23 23:38 ` David S. Miller
3 siblings, 0 replies; 23+ messages in thread
From: David S. Miller @ 2004-04-23 23:38 UTC (permalink / raw)
To: glauber; +Cc: linux-arch, schwidefsky
FWIW, sparc64 has a smp_cross_call_masked() internal routine that sends the IPI
to a cpumask_t specified set of cpus.
So whatever is decided, it can easily be implemented on that platform.
* Re: [Patch] SMP call function cleanup
@ 2004-04-23 6:46 Martin Schwidefsky
0 siblings, 0 replies; 23+ messages in thread
From: Martin Schwidefsky @ 2004-04-23 6:46 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: glauber, Linux Arch list
> Well, I'd go further and allow an arbitrary cpu_mask. There are
> cases where being able to send an IPI to a "set" of CPUs could be
> an interesting optimisation. If the HW can't do it easily,
> just send to all and software-mask who gets the actual function
> call.
That is exactly what Jan's patch provides. The on_cpu function takes
an arbitrary cpu_mask; on_each_cpu, on_other_cpus and on_single_cpu
just expand to an on_cpu call with the appropriate mask.
blue skies,
Martin
Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: schwidefsky@de.ibm.com
* Re: [Patch] SMP call function cleanup
@ 2004-04-23 6:53 Martin Schwidefsky
2004-04-23 7:05 ` Anton Blanchard
0 siblings, 1 reply; 23+ messages in thread
From: Martin Schwidefsky @ 2004-04-23 6:53 UTC (permalink / raw)
To: Anton Blanchard
Cc: Benjamin Herrenschmidt, Jan Glauber, Linux Arch list,
William Lee Irwin III
> Then someone uses it in a performance-critical place and my big SMP
> ppc64 performance sucks.
I still have a problem understanding this argument. Either there is a
need to execute a function on a selected set of cpus or there isn't.
If there is, and the architecture is slow doing IPIs to single
cpus, then the architecture can choose to do an IPI to all cpus
and do the mask selection in the IPI interrupt.
> We still havent been given a place in generic code where this
> optimisation makes sense.
OK, that's a valid point.
blue skies,
Martin
Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: schwidefsky@de.ibm.com
* Re: [Patch] SMP call function cleanup
2004-04-23 6:53 Martin Schwidefsky
@ 2004-04-23 7:05 ` Anton Blanchard
2004-04-23 9:00 ` Russell King
0 siblings, 1 reply; 23+ messages in thread
From: Anton Blanchard @ 2004-04-23 7:05 UTC (permalink / raw)
To: Martin Schwidefsky
Cc: Benjamin Herrenschmidt, Jan Glauber, Linux Arch list,
William Lee Irwin III
Hi,
> I still have problem to understand this argument. Either there is the
> need to execute a function on a selected set of cpus or there isn't.
> If there is the need and the architecture is slow doing IPI's to single
> cpus then the architecture can choose to do a IPI to all cpus
> and do the mask selection in the IPI interrupt.
The worry is that we then start changing algorithms to suit; I could see
someone replacing a timer-based algorithm with an IPI-send-to-cpumask
one. I'm theorising here, and the IPI method may end up being more
efficient on all architectures, but we suffer from not having a decent
example case yet.
Since we tend to avoid creating infrastructure just in case someone
finds a use for it, I'd prefer to avoid adding it at this stage.
Anton
* Re: [Patch] SMP call function cleanup
2004-04-23 7:05 ` Anton Blanchard
@ 2004-04-23 9:00 ` Russell King
0 siblings, 0 replies; 23+ messages in thread
From: Russell King @ 2004-04-23 9:00 UTC (permalink / raw)
To: Anton Blanchard
Cc: Martin Schwidefsky, Benjamin Herrenschmidt, Jan Glauber,
Linux Arch list, William Lee Irwin III
On Fri, Apr 23, 2004 at 05:05:21PM +1000, Anton Blanchard wrote:
> The worry is that we then start changing algorithms to suit; I could see
> someone replacing a timer-based algorithm with an IPI-send-to-cpumask
> one. I'm theorising here, and the IPI method may end up being more
> efficient on all architectures, but we suffer from not having a decent
> example case yet.
Indeed and completely agreed.
Also, consider that the sending CPU may not be able to receive the IPI
itself, and should be excluded if that is the case.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core
* Re: [Patch] SMP call function cleanup
@ 2004-04-23 7:46 Martin Schwidefsky
0 siblings, 0 replies; 23+ messages in thread
From: Martin Schwidefsky @ 2004-04-23 7:46 UTC (permalink / raw)
To: Anton Blanchard
Cc: Benjamin Herrenschmidt, Jan Glauber, Linux Arch list,
William Lee Irwin III
> The worry is that we then start changing algorithms to suit; I could see
> someone replacing a timer-based algorithm with an IPI-send-to-cpumask
> one. I'm theorising here, and the IPI method may end up being more
> efficient on all architectures, but we suffer from not having a decent
> example case yet.
We'd need a way to hide the on_cpu function then. What you want is the
old-style interface that only allows a call either to all cpus or to
a single one.
> Since we tend to avoid creating infrastructure just in case someone
> finds a use for it Id prefer to avoid adding it at this stage.
I still like the cleanup part of the patch.
blue skies,
Martin
Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: schwidefsky@de.ibm.com
Thread overview: 23+ messages
2004-04-22 12:21 [Patch] SMP call function cleanup Jan Glauber
2004-04-22 12:28 ` William Lee Irwin III
2004-04-22 12:37 ` Anton Blanchard
2004-04-22 12:49 ` William Lee Irwin III
2004-04-22 12:59 ` James Bottomley
2004-04-22 13:24 ` William Lee Irwin III
2004-04-22 13:46 ` James Bottomley
2004-04-22 14:06 ` William Lee Irwin III
[not found] ` <1082641822.1329.45.camel@halo>
[not found] ` <1082642332.1778.39.camel@mulgrave>
2004-04-22 14:15 ` Jan Glauber
2004-04-23 0:05 ` Benjamin Herrenschmidt
2004-04-23 0:21 ` Anton Blanchard
2004-04-23 0:32 ` Benjamin Herrenschmidt
2004-04-22 13:58 ` Jan Glauber
2004-04-22 12:33 ` Anton Blanchard
2004-04-22 14:00 ` Jan Glauber
2004-04-22 14:13 ` William Lee Irwin III
2004-04-23 0:04 ` Benjamin Herrenschmidt
2004-04-23 23:38 ` David S. Miller
-- strict thread matches above, loose matches on Subject: below --
2004-04-23 6:46 Martin Schwidefsky
2004-04-23 6:53 Martin Schwidefsky
2004-04-23 7:05 ` Anton Blanchard
2004-04-23 9:00 ` Russell King
2004-04-23 7:46 Martin Schwidefsky