linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations
@ 2008-07-15 21:14 Mike Travis
  2008-07-15 21:14 ` [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Mike Travis
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel


> From Rusty Russell <rusty@rustcorp.com.au>:
>  
>    Hi Christoph/Mike,
>    
>      Looked at cpumask_of_cpu as introduced in 
>    9f0e8d0400d925c3acd5f4e01dbeb736e4011882 (x86: convert cpumask_of_cpu
>    macro to allocated array), and I don't think it's safe:
>    
>      #define cpumask_of_cpu(cpu)					\
>      (*({								\
>    	typeof(_unused_cpumask_arg_) m;					\
>    	if (sizeof(m) == sizeof(unsigned long)) {			\
>    		m.bits[0] = 1UL<<(cpu);					\
>    	} else {							\
>    		cpus_clear(m);						\
>    		cpu_set((cpu), m);					\
>    	}								\
>    	&m;								\
>      }))
>    
>    Referring to &m once out of scope is invalid, and I can't find any
>    evidence that it's legal here.  In particular, the change 
>    b53e921ba1cff8453dc9a87a84052fa12d5b30bd (generic: reduce stack pressure
>    in sched_affinity) which passes &m to other functions seems highly risky.

  * Patch 1 replaces the dangerous lvalue version of cpumask_of_cpu
    with new cpumask_of_cpu_ptr macros.  These are patterned after the
    node_to_cpumask_ptr macros.

  * Patches 2..6 optimizes various places where a pointer to the cpumask_of_cpu
    value will result in reducing stack pressure.

  * Patch 7 provides a generic set of CPUMASK_ALLOC macros patterned after
    the SCHED_CPUMASK_ALLOC macros.  This is used where multiple cpumask_t
    variables are declared on the stack to reduce the amount of stack space
    required.

  * Patch 8 uses the CPUMASK_ALLOC macros in the centrino_target() function.

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200


Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Greg Banks <gnb@sgi.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul Jackson <pj@sgi.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Dean Nelson <dcn@sgi.com>
Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
---

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
@ 2008-07-15 21:14 ` Mike Travis
  2008-07-18  5:30   ` Rusty Russell
  2008-07-15 21:14 ` [PATCH 2/8] cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c Mike Travis
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel,
	Len Brown, Dave Jones, Paul Jackson, Tigran Aivazian,
	Robert Richter, Greg Banks, Eric W. Biederman, Adrian Bunk,
	Thomas Gleixner, Andreas Schwab, Johannes Weiner

[-- Attachment #1: replace-cpumask_of_cpu --]
[-- Type: text/plain, Size: 19487 bytes --]

  * This patch replaces the dangerous lvalue version of cpumask_of_cpu
    with new cpumask_of_cpu_ptr macros.  These are patterned after the
    node_to_cpumask_ptr macros.

    In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
    the cpumask_of_cpu_map[cpu] entry is used.  The cpumask_of_cpu_map
    is provided when there is a large NR_CPUS count, reducing
    greatly the amount of code generated and stack space used for
    cpumask_of_cpu().  The pointer to the cpumask_t value is needed for
    calling set_cpus_allowed_ptr() to reduce the amount of stack space
    needed to pass the cpumask_t value.

    If there isn't a cpumask_of_cpu_map[], then a temporary variable is
    declared and filled in with value from cpumask_of_cpu(cpu) as well as
    a pointer variable pointing to this temporary variable.  Afterwards,
    the pointer is used to reference the cpumask value.  The compiler
    will optimize out the extra dereference through the pointer as well
    as the stack space used for the pointer, resulting in identical code.

    A good example of the orthogonal usages is in net/sunrpc/svc.c:

	case SVC_POOL_PERCPU:
	{
		unsigned int cpu = m->pool_to[pidx];
		cpumask_of_cpu_ptr(cpumask, cpu);

		*oldmask = current->cpus_allowed;
		set_cpus_allowed_ptr(current, cpumask);
		return 1;
	}
	case SVC_POOL_PERNODE:
	{
		unsigned int node = m->pool_to[pidx];
		node_to_cpumask_ptr(nodecpumask, node);

		*oldmask = current->cpus_allowed;
		set_cpus_allowed_ptr(current, nodecpumask);
		return 1;
	}

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200

Signed-off-by: Mike Travis <travis@sgi.com>

* ACPI
Cc: Len Brown <len.brown@intel.com>

* CPUFREQ
Cc: Dave Jones <davej@codemonkey.org.uk>

* CPUMASK
Cc: Paul Jackson <pj@sgi.com>

* MICROCODE
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>

* PROFILE
Cc: Robert Richter <robert.richter@amd.com>

* SUNRPC
Cc: Greg Banks <gnb@sgi.com>

* Various
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andreas Schwab <schwab@suse.de>
Cc: Johannes Weiner <hannes@saeurebad.de>

---

    Unfortunately, I cannot break down this patch into separate pieces
    as reverting the cpumask_of_cpu() macro back to it's prior form will
    result in compile errors when referenced as an lvalue.

	One error from checkpatch.pl:

	ERROR: Macros with complex values should be enclosed in parenthesis
	#472: FILE: include/linux/cpumask.h:253:
	+#define cpumask_of_cpu_ptr_next(v, cpu) \
	+                               v = &cpumask_of_cpu(cpu)
---
 arch/x86/kernel/acpi/cstate.c                    |    3 +-
 arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c       |   10 +++++--
 arch/x86/kernel/cpu/cpufreq/powernow-k8.c        |   15 +++++++---
 arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c |    9 ++++--
 arch/x86/kernel/cpu/cpufreq/speedstep-ich.c      |    3 +-
 arch/x86/kernel/cpu/intel_cacheinfo.c            |    3 +-
 arch/x86/kernel/microcode.c                      |   13 ++++++---
 arch/x86/kernel/reboot.c                         |   14 ++++++----
 drivers/acpi/processor_throttling.c              |   11 +++++--
 drivers/firmware/dcdbas.c                        |    3 +-
 include/linux/cpumask.h                          |   32 +++++++++++++++++++----
 kernel/stop_machine.c                            |    3 +-
 kernel/trace/trace_sysprof.c                     |    4 ++
 net/sunrpc/svc.c                                 |    3 +-
 14 files changed, 91 insertions(+), 35 deletions(-)

--- linux-2.6.tip.orig/arch/x86/kernel/acpi/cstate.c
+++ linux-2.6.tip/arch/x86/kernel/acpi/cstate.c
@@ -73,6 +73,7 @@ int acpi_processor_ffh_cstate_probe(unsi
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 
 	cpumask_t saved_mask;
+	cpumask_of_cpu_ptr(new_mask, cpu);
 	int retval;
 	unsigned int eax, ebx, ecx, edx;
 	unsigned int edx_part;
@@ -91,7 +92,7 @@ int acpi_processor_ffh_cstate_probe(unsi
 
 	/* Make sure we are running on right CPU */
 	saved_mask = current->cpus_allowed;
-	retval = set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+	retval = set_cpus_allowed_ptr(current, new_mask);
 	if (retval)
 		return -1;
 
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -200,10 +200,12 @@ static void drv_read(struct drv_cmd *cmd
 static void drv_write(struct drv_cmd *cmd)
 {
 	cpumask_t saved_mask = current->cpus_allowed;
+	cpumask_of_cpu_ptr_declare(cpu_mask);
 	unsigned int i;
 
 	for_each_cpu_mask_nr(i, cmd->mask) {
-		set_cpus_allowed_ptr(current, &cpumask_of_cpu(i));
+		cpumask_of_cpu_ptr_next(cpu_mask, i);
+		set_cpus_allowed_ptr(current, cpu_mask);
 		do_drv_write(cmd);
 	}
 
@@ -267,11 +269,12 @@ static unsigned int get_measured_perf(un
 	} aperf_cur, mperf_cur;
 
 	cpumask_t saved_mask;
+	cpumask_of_cpu_ptr(cpu_mask, cpu);
 	unsigned int perf_percent;
 	unsigned int retval;
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+	set_cpus_allowed_ptr(current, cpu_mask);
 	if (get_cpu() != cpu) {
 		/* We were not able to run on requested processor */
 		put_cpu();
@@ -337,6 +340,7 @@ static unsigned int get_measured_perf(un
 
 static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
 {
+	cpumask_of_cpu_ptr(cpu_mask, cpu);
 	struct acpi_cpufreq_data *data = per_cpu(drv_data, cpu);
 	unsigned int freq;
 	unsigned int cached_freq;
@@ -349,7 +353,7 @@ static unsigned int get_cur_freq_on_cpu(
 	}
 
 	cached_freq = data->freq_table[data->acpi_data->state].frequency;
-	freq = extract_freq(get_cur_val(&cpumask_of_cpu(cpu)), data);
+	freq = extract_freq(get_cur_val(cpu_mask), data);
 	if (freq != cached_freq) {
 		/*
 		 * The dreaded BIOS frequency change behind our back.
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
@@ -479,11 +479,12 @@ static int core_voltage_post_transition(
 static int check_supported_cpu(unsigned int cpu)
 {
 	cpumask_t oldmask;
+	cpumask_of_cpu_ptr(cpu_mask, cpu);
 	u32 eax, ebx, ecx, edx;
 	unsigned int rc = 0;
 
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+	set_cpus_allowed_ptr(current, cpu_mask);
 
 	if (smp_processor_id() != cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", cpu);
@@ -1016,6 +1017,7 @@ static int transition_frequency_pstate(s
 static int powernowk8_target(struct cpufreq_policy *pol, unsigned targfreq, unsigned relation)
 {
 	cpumask_t oldmask;
+	cpumask_of_cpu_ptr(cpu_mask, pol->cpu);
 	struct powernow_k8_data *data = per_cpu(powernow_data, pol->cpu);
 	u32 checkfid;
 	u32 checkvid;
@@ -1030,7 +1032,7 @@ static int powernowk8_target(struct cpuf
 
 	/* only run on specific CPU from here on */
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(pol->cpu));
+	set_cpus_allowed_ptr(current, cpu_mask);
 
 	if (smp_processor_id() != pol->cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1105,6 +1107,7 @@ static int __cpuinit powernowk8_cpu_init
 {
 	struct powernow_k8_data *data;
 	cpumask_t oldmask;
+	cpumask_of_cpu_ptr_declare(newmask);
 	int rc;
 
 	if (!cpu_online(pol->cpu))
@@ -1156,7 +1159,8 @@ static int __cpuinit powernowk8_cpu_init
 
 	/* only run on specific CPU from here on */
 	oldmask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(pol->cpu));
+	cpumask_of_cpu_ptr_next(newmask, pol->cpu);
+	set_cpus_allowed_ptr(current, newmask);
 
 	if (smp_processor_id() != pol->cpu) {
 		printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1178,7 +1182,7 @@ static int __cpuinit powernowk8_cpu_init
 	set_cpus_allowed_ptr(current, &oldmask);
 
 	if (cpu_family == CPU_HW_PSTATE)
-		pol->cpus = cpumask_of_cpu(pol->cpu);
+		pol->cpus = *newmask;
 	else
 		pol->cpus = per_cpu(cpu_core_map, pol->cpu);
 	data->available_cores = &(pol->cpus);
@@ -1244,6 +1248,7 @@ static unsigned int powernowk8_get (unsi
 {
 	struct powernow_k8_data *data;
 	cpumask_t oldmask = current->cpus_allowed;
+	cpumask_of_cpu_ptr(newmask, cpu);
 	unsigned int khz = 0;
 	unsigned int first;
 
@@ -1253,7 +1258,7 @@ static unsigned int powernowk8_get (unsi
 	if (!data)
 		return -EINVAL;
 
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+	set_cpus_allowed_ptr(current, newmask);
 	if (smp_processor_id() != cpu) {
 		printk(KERN_ERR PFX
 			"limiting to CPU %d failed in powernowk8_get\n", cpu);
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
@@ -406,9 +406,10 @@ static unsigned int get_cur_freq(unsigne
 	unsigned l, h;
 	unsigned clock_freq;
 	cpumask_t saved_mask;
+	cpumask_of_cpu_ptr(new_mask, cpu);
 
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+	set_cpus_allowed_ptr(current, new_mask);
 	if (smp_processor_id() != cpu)
 		return 0;
 
@@ -647,9 +648,11 @@ static int centrino_target (struct cpufr
 		 */
 
 		if (!cpus_empty(covered_cpus)) {
+			cpumask_of_cpu_ptr_declare(new_mask);
+
 			for_each_cpu_mask_nr(j, covered_cpus) {
-				set_cpus_allowed_ptr(current,
-						     &cpumask_of_cpu(j));
+				cpumask_of_cpu_ptr_next(new_mask, j);
+				set_cpus_allowed_ptr(current, new_mask);
 				wrmsr(MSR_IA32_PERF_CTL, oldmsr, h);
 			}
 		}
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
@@ -244,7 +244,8 @@ static unsigned int _speedstep_get(const
 
 static unsigned int speedstep_get(unsigned int cpu)
 {
-	return _speedstep_get(&cpumask_of_cpu(cpu));
+	cpumask_of_cpu_ptr(newmask, cpu);
+	return _speedstep_get(newmask);
 }
 
 /**
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -516,6 +516,7 @@ static int __cpuinit detect_cache_attrib
 	unsigned long		j;
 	int			retval;
 	cpumask_t		oldmask;
+	cpumask_of_cpu_ptr(newmask, cpu);
 
 	if (num_cache_leaves == 0)
 		return -ENOENT;
@@ -526,7 +527,7 @@ static int __cpuinit detect_cache_attrib
 		return -ENOMEM;
 
 	oldmask = current->cpus_allowed;
-	retval = set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+	retval = set_cpus_allowed_ptr(current, newmask);
 	if (retval)
 		goto out;
 
--- linux-2.6.tip.orig/arch/x86/kernel/microcode.c
+++ linux-2.6.tip/arch/x86/kernel/microcode.c
@@ -388,6 +388,7 @@ static int do_microcode_update (void)
 	void *new_mc = NULL;
 	int cpu;
 	cpumask_t old;
+	cpumask_of_cpu_ptr_declare(newmask);
 
 	old = current->cpus_allowed;
 
@@ -404,7 +405,8 @@ static int do_microcode_update (void)
 
 			if (!uci->valid)
 				continue;
-			set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+			cpumask_of_cpu_ptr_next(newmask, cpu);
+			set_cpus_allowed_ptr(current, newmask);
 			error = get_maching_microcode(new_mc, cpu);
 			if (error < 0)
 				goto out;
@@ -574,6 +576,7 @@ static int apply_microcode_check_cpu(int
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 	struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
 	cpumask_t old;
+	cpumask_of_cpu_ptr(newmask, cpu);
 	unsigned int val[2];
 	int err = 0;
 
@@ -582,7 +585,7 @@ static int apply_microcode_check_cpu(int
 		return 0;
 
 	old = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+	set_cpus_allowed_ptr(current, newmask);
 
 	/* Check if the microcode we have in memory matches the CPU */
 	if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 < 6 ||
@@ -620,11 +623,12 @@ static int apply_microcode_check_cpu(int
 static void microcode_init_cpu(int cpu, int resume)
 {
 	cpumask_t old;
+	cpumask_of_cpu_ptr(newmask, cpu);
 	struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
 
 	old = current->cpus_allowed;
 
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+	set_cpus_allowed_ptr(current, newmask);
 	mutex_lock(&microcode_mutex);
 	collect_cpu_info(cpu);
 	if (uci->valid && system_state == SYSTEM_RUNNING && !resume)
@@ -656,11 +660,12 @@ static ssize_t reload_store(struct sys_d
 		return -EINVAL;
 	if (val == 1) {
 		cpumask_t old;
+		cpumask_of_cpu_ptr(newmask, cpu);
 
 		old = current->cpus_allowed;
 
 		get_online_cpus();
-		set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+		set_cpus_allowed_ptr(current, newmask);
 
 		mutex_lock(&microcode_mutex);
 		if (uci->valid)
--- linux-2.6.tip.orig/arch/x86/kernel/reboot.c
+++ linux-2.6.tip/arch/x86/kernel/reboot.c
@@ -403,24 +403,28 @@ void native_machine_shutdown(void)
 {
 	/* Stop the cpus and apics */
 #ifdef CONFIG_SMP
-	int reboot_cpu_id;
 
 	/* The boot cpu is always logical cpu 0 */
-	reboot_cpu_id = 0;
+	int reboot_cpu_id = 0;
+	cpumask_of_cpu_ptr(newmask, reboot_cpu_id);
 
 #ifdef CONFIG_X86_32
 	/* See if there has been given a command line override */
 	if ((reboot_cpu != -1) && (reboot_cpu < NR_CPUS) &&
-		cpu_online(reboot_cpu))
+		cpu_online(reboot_cpu)) {
 		reboot_cpu_id = reboot_cpu;
+		cpumask_of_cpu_ptr_next(newmask, reboot_cpu_id);
+	}
 #endif
 
 	/* Make certain the cpu I'm about to reboot on is online */
-	if (!cpu_online(reboot_cpu_id))
+	if (!cpu_online(reboot_cpu_id)) {
 		reboot_cpu_id = smp_processor_id();
+		cpumask_of_cpu_ptr_next(newmask, reboot_cpu_id);
+	}
 
 	/* Make certain I only run on the appropriate processor */
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(reboot_cpu_id));
+	set_cpus_allowed_ptr(current, newmask);
 
 	/* O.K Now that I'm on the appropriate processor,
 	 * stop all of the others.
--- linux-2.6.tip.orig/drivers/acpi/processor_throttling.c
+++ linux-2.6.tip/drivers/acpi/processor_throttling.c
@@ -827,6 +827,7 @@ static int acpi_processor_get_throttling
 static int acpi_processor_get_throttling(struct acpi_processor *pr)
 {
 	cpumask_t saved_mask;
+	cpumask_of_cpu_ptr_declare(new_mask);
 	int ret;
 
 	if (!pr)
@@ -838,7 +839,8 @@ static int acpi_processor_get_throttling
 	 * Migrate task to the cpu pointed by pr.
 	 */
 	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(pr->id));
+	cpumask_of_cpu_ptr_next(new_mask, pr->id);
+	set_cpus_allowed_ptr(current, new_mask);
 	ret = pr->throttling.acpi_processor_get_throttling(pr);
 	/* restore the previous state */
 	set_cpus_allowed_ptr(current, &saved_mask);
@@ -987,6 +989,7 @@ static int acpi_processor_set_throttling
 int acpi_processor_set_throttling(struct acpi_processor *pr, int state)
 {
 	cpumask_t saved_mask;
+	cpumask_of_cpu_ptr_declare(new_mask);
 	int ret = 0;
 	unsigned int i;
 	struct acpi_processor *match_pr;
@@ -1025,7 +1028,8 @@ int acpi_processor_set_throttling(struct
 	 * it can be called only for the cpu pointed by pr.
 	 */
 	if (p_throttling->shared_type == DOMAIN_COORD_TYPE_SW_ANY) {
-		set_cpus_allowed_ptr(current, &cpumask_of_cpu(pr->id));
+		cpumask_of_cpu_ptr_next(new_mask, pr->id);
+		set_cpus_allowed_ptr(current, new_mask);
 		ret = p_throttling->acpi_processor_set_throttling(pr,
 						t_state.target_state);
 	} else {
@@ -1056,7 +1060,8 @@ int acpi_processor_set_throttling(struct
 				continue;
 			}
 			t_state.cpu = i;
-			set_cpus_allowed_ptr(current, &cpumask_of_cpu(i));
+			cpumask_of_cpu_ptr_next(new_mask, i);
+			set_cpus_allowed_ptr(current, new_mask);
 			ret = match_pr->throttling.
 				acpi_processor_set_throttling(
 				match_pr, t_state.target_state);
--- linux-2.6.tip.orig/drivers/firmware/dcdbas.c
+++ linux-2.6.tip/drivers/firmware/dcdbas.c
@@ -254,6 +254,7 @@ static ssize_t host_control_on_shutdown_
 static int smi_request(struct smi_cmd *smi_cmd)
 {
 	cpumask_t old_mask;
+	cpumask_of_cpu_ptr(new_mask, 0);
 	int ret = 0;
 
 	if (smi_cmd->magic != SMI_CMD_MAGIC) {
@@ -264,7 +265,7 @@ static int smi_request(struct smi_cmd *s
 
 	/* SMI requires CPU 0 */
 	old_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu(0));
+	set_cpus_allowed_ptr(current, new_mask);
 	if (smp_processor_id() != 0) {
 		dev_dbg(&dcdbas_pdev->dev, "%s: failed to get CPU 0\n",
 			__func__);
--- linux-2.6.tip.orig/include/linux/cpumask.h
+++ linux-2.6.tip/include/linux/cpumask.h
@@ -62,6 +62,15 @@
  * int next_cpu_nr(cpu, mask)		Next cpu past 'cpu', or nr_cpu_ids
  *
  * cpumask_t cpumask_of_cpu(cpu)	Return cpumask with bit 'cpu' set
+ *ifdef CONFIG_HAS_CPUMASK_OF_CPU
+ * cpumask_of_cpu_ptr_declare(v)	Declares cpumask_t *v
+ * cpumask_of_cpu_ptr_next(v, cpu)	Sets v = &cpumask_of_cpu_map[cpu]
+ * cpumask_of_cpu_ptr(v, cpu)		Combines above two operations
+ *else
+ * cpumask_of_cpu_ptr_declare(v)	Declares cpumask_t _v and *v = &_v
+ * cpumask_of_cpu_ptr_next(v, cpu)	Sets _v = cpumask_of_cpu(cpu)
+ * cpumask_of_cpu_ptr(v, cpu)		Combines above two operations
+ *endif
  * CPU_MASK_ALL				Initializer - all bits set
  * CPU_MASK_NONE			Initializer - no bits set
  * unsigned long *cpus_addr(mask)	Array of unsigned long's in mask
@@ -236,11 +245,16 @@ static inline void __cpus_shift_left(cpu
 
 #ifdef CONFIG_HAVE_CPUMASK_OF_CPU_MAP
 extern cpumask_t *cpumask_of_cpu_map;
-#define cpumask_of_cpu(cpu)    (cpumask_of_cpu_map[cpu])
-
+#define cpumask_of_cpu(cpu)	(cpumask_of_cpu_map[cpu])
+#define	cpumask_of_cpu_ptr(v, cpu)					\
+		const cpumask_t *v = &cpumask_of_cpu(cpu)
+#define	cpumask_of_cpu_ptr_declare(v)					\
+		const cpumask_t *v
+#define cpumask_of_cpu_ptr_next(v, cpu)					\
+					v = &cpumask_of_cpu(cpu)
 #else
 #define cpumask_of_cpu(cpu)						\
-(*({									\
+({									\
 	typeof(_unused_cpumask_arg_) m;					\
 	if (sizeof(m) == sizeof(unsigned long)) {			\
 		m.bits[0] = 1UL<<(cpu);					\
@@ -248,8 +262,16 @@ extern cpumask_t *cpumask_of_cpu_map;
 		cpus_clear(m);						\
 		cpu_set((cpu), m);					\
 	}								\
-	&m;								\
-}))
+	m;								\
+})
+#define	cpumask_of_cpu_ptr(v, cpu) 					\
+		cpumask_t _##v = cpumask_of_cpu(cpu);			\
+		const cpumask_t *v = &_##v
+#define	cpumask_of_cpu_ptr_declare(v)					\
+		cpumask_t _##v;						\
+		const cpumask_t *v = &_##v
+#define cpumask_of_cpu_ptr_next(v, cpu)					\
+					_##v = cpumask_of_cpu(cpu)
 #endif
 
 #define CPU_MASK_LAST_WORD BITMAP_LAST_WORD_MASK(NR_CPUS)
--- linux-2.6.tip.orig/kernel/stop_machine.c
+++ linux-2.6.tip/kernel/stop_machine.c
@@ -33,8 +33,9 @@ static int stopmachine(void *cpu)
 {
 	int irqs_disabled = 0;
 	int prepared = 0;
+	cpumask_of_cpu_ptr(cpumask, (int)(long)cpu);
 
-	set_cpus_allowed_ptr(current, &cpumask_of_cpu((int)(long)cpu));
+	set_cpus_allowed_ptr(current, cpumask);
 
 	/* Ack: we are alive */
 	smp_mb(); /* Theoretically the ack = 0 might not be on this CPU yet. */
--- linux-2.6.tip.orig/kernel/trace/trace_sysprof.c
+++ linux-2.6.tip/kernel/trace/trace_sysprof.c
@@ -213,7 +213,9 @@ static void start_stack_timers(void)
 	int cpu;
 
 	for_each_online_cpu(cpu) {
-		set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+		cpumask_of_cpu_ptr(new_mask, cpu);
+
+		set_cpus_allowed_ptr(current, new_mask);
 		start_stack_timer(cpu);
 	}
 	set_cpus_allowed_ptr(current, &saved_mask);
--- linux-2.6.tip.orig/net/sunrpc/svc.c
+++ linux-2.6.tip/net/sunrpc/svc.c
@@ -314,9 +314,10 @@ svc_pool_map_set_cpumask(unsigned int pi
 	case SVC_POOL_PERCPU:
 	{
 		unsigned int cpu = m->pool_to[pidx];
+		cpumask_of_cpu_ptr(cpumask, cpu);
 
 		*oldmask = current->cpus_allowed;
-		set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
+		set_cpus_allowed_ptr(current, cpumask);
 		return 1;
 	}
 	case SVC_POOL_PERNODE:

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 2/8] cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
  2008-07-15 21:14 ` [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Mike Travis
@ 2008-07-15 21:14 ` Mike Travis
  2008-07-15 21:14 ` [PATCH 3/8] cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c Mike Travis
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel,
	Eric W. Biederman

[-- Attachment #1: optimize-cpu_mask_of_cpu-in-arch_x86_kernel_io_apic_64_c --]
[-- Type: text/plain, Size: 1091 bytes --]

  * Optimize various places where a pointer to the cpumask_of_cpu value
    will result in reducing stack pressure.

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200


Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
---
 arch/x86/kernel/io_apic_64.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- linux-2.6.tip.orig/arch/x86/kernel/io_apic_64.c
+++ linux-2.6.tip/arch/x86/kernel/io_apic_64.c
@@ -1372,12 +1372,10 @@ static unsigned int startup_ioapic_irq(u
 static int ioapic_retrigger_irq(unsigned int irq)
 {
 	struct irq_cfg *cfg = &irq_cfg[irq];
-	cpumask_t mask;
 	unsigned long flags;
 
 	spin_lock_irqsave(&vector_lock, flags);
-	mask = cpumask_of_cpu(first_cpu(cfg->domain));
-	send_IPI_mask(mask, cfg->vector);
+	send_IPI_mask(cpumask_of_cpu(first_cpu(cfg->domain)), cfg->vector);
 	spin_unlock_irqrestore(&vector_lock, flags);
 
 	return 1;

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 3/8] cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
  2008-07-15 21:14 ` [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Mike Travis
  2008-07-15 21:14 ` [PATCH 2/8] cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c Mike Travis
@ 2008-07-15 21:14 ` Mike Travis
  2008-07-15 21:14 ` [PATCH 4/8] cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c Mike Travis
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel

[-- Attachment #1: optimize-cpu_mask_of_cpu-in-arch_x86_kernel_ldt_c --]
[-- Type: text/plain, Size: 1045 bytes --]

  * Optimize various places where a pointer to the cpumask_of_cpu value
    will result in reducing stack pressure.

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/ldt.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.tip.orig/arch/x86/kernel/ldt.c
+++ linux-2.6.tip/arch/x86/kernel/ldt.c
@@ -62,12 +62,12 @@ static int alloc_ldt(mm_context_t *pc, i
 
 	if (reload) {
 #ifdef CONFIG_SMP
-		cpumask_t mask;
+		cpumask_of_cpu_ptr_declare(mask);
 
 		preempt_disable();
 		load_LDT(pc);
-		mask = cpumask_of_cpu(smp_processor_id());
-		if (!cpus_equal(current->mm->cpu_vm_mask, mask))
+		cpumask_of_cpu_ptr_next(mask, smp_processor_id());
+		if (!cpus_equal(current->mm->cpu_vm_mask, *mask))
 			smp_call_function(flush_ldt, current->mm, 1);
 		preempt_enable();
 #else

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 4/8] cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
                   ` (2 preceding siblings ...)
  2008-07-15 21:14 ` [PATCH 3/8] cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c Mike Travis
@ 2008-07-15 21:14 ` Mike Travis
  2008-07-16 11:17   ` Dean Nelson
  2008-07-15 21:14 ` [PATCH 5/8] cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c Mike Travis
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel,
	Dean Nelson

[-- Attachment #1: optimize-cpu_mask_of_cpu-in-drivers_misc_sgi-xp_xpc_main_c --]
[-- Type: text/plain, Size: 1080 bytes --]

  * Optimize various places where a pointer to the cpumask_of_cpu value
    will result in reducing stack pressure.

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Dean Nelson <dcn@sgi.com>
---
 drivers/misc/sgi-xp/xpc_main.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-2.6.tip.orig/drivers/misc/sgi-xp/xpc_main.c
+++ linux-2.6.tip/drivers/misc/sgi-xp/xpc_main.c
@@ -229,10 +229,11 @@ xpc_hb_checker(void *ignore)
 	int last_IRQ_count = 0;
 	int new_IRQ_count;
 	int force_IRQ = 0;
+	cpumask_of_cpu_ptr(cpumask, XPC_HB_CHECK_CPU);
 
 	/* this thread was marked active by xpc_hb_init() */
 
-	set_cpus_allowed(current, cpumask_of_cpu(XPC_HB_CHECK_CPU));
+	set_cpus_allowed_ptr(current, cpumask);
 
 	/* set our heartbeating to other partitions into motion */
 	xpc_hb_check_timeout = jiffies + (xpc_hb_check_interval * HZ);

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 5/8] cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
                   ` (3 preceding siblings ...)
  2008-07-15 21:14 ` [PATCH 4/8] cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c Mike Travis
@ 2008-07-15 21:14 ` Mike Travis
  2008-07-15 21:14 ` [PATCH 6/8] cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c Mike Travis
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel,
	Thomas Gleixner

[-- Attachment #1: optimize-cpu_mask_of_cpu-in-kernel_time_tick-common_c --]
[-- Type: text/plain, Size: 2302 bytes --]

  * Optimize various places where a pointer to the cpumask_of_cpu value
    will result in reducing stack pressure.

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-common.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- linux-2.6.tip.orig/kernel/time/tick-common.c
+++ linux-2.6.tip/kernel/time/tick-common.c
@@ -135,7 +135,7 @@ void tick_setup_periodic(struct clock_ev
  */
 static void tick_setup_device(struct tick_device *td,
 			      struct clock_event_device *newdev, int cpu,
-			      cpumask_t cpumask)
+			      const cpumask_t *cpumask)
 {
 	ktime_t next_event;
 	void (*handler)(struct clock_event_device *) = NULL;
@@ -169,8 +169,8 @@ static void tick_setup_device(struct tic
 	 * When the device is not per cpu, pin the interrupt to the
 	 * current cpu:
 	 */
-	if (!cpus_equal(newdev->cpumask, cpumask))
-		irq_set_affinity(newdev->irq, cpumask);
+	if (!cpus_equal(newdev->cpumask, *cpumask))
+		irq_set_affinity(newdev->irq, *cpumask);
 
 	/*
 	 * When global broadcasting is active, check if the current
@@ -196,20 +196,20 @@ static int tick_check_new_device(struct 
 	struct tick_device *td;
 	int cpu, ret = NOTIFY_OK;
 	unsigned long flags;
-	cpumask_t cpumask;
+	cpumask_of_cpu_ptr_declare(cpumask);
 
 	spin_lock_irqsave(&tick_device_lock, flags);
 
 	cpu = smp_processor_id();
+	cpumask_of_cpu_ptr_next(cpumask, cpu);
 	if (!cpu_isset(cpu, newdev->cpumask))
 		goto out_bc;
 
 	td = &per_cpu(tick_cpu_device, cpu);
 	curdev = td->evtdev;
-	cpumask = cpumask_of_cpu(cpu);
 
 	/* cpu local device ? */
-	if (!cpus_equal(newdev->cpumask, cpumask)) {
+	if (!cpus_equal(newdev->cpumask, *cpumask)) {
 
 		/*
 		 * If the cpu affinity of the device interrupt can not
@@ -222,7 +222,7 @@ static int tick_check_new_device(struct 
 		 * If we have a cpu local device already, do not replace it
 		 * by a non cpu local device
 		 */
-		if (curdev && cpus_equal(curdev->cpumask, cpumask))
+		if (curdev && cpus_equal(curdev->cpumask, *cpumask))
 			goto out_bc;
 	}
 

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 6/8] cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
                   ` (4 preceding siblings ...)
  2008-07-15 21:14 ` [PATCH 5/8] cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c Mike Travis
@ 2008-07-15 21:14 ` Mike Travis
  2008-07-18 20:33   ` Ingo Molnar
  2008-07-15 21:14 ` [PATCH 7/8] cpumask: Provide a generic set of CPUMASK_ALLOC macros Mike Travis
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel

[-- Attachment #1: optimize-cpu_mask_of_cpu-in-lib_smp_processor_id_c --]
[-- Type: text/plain, Size: 1187 bytes --]

  * Optimize various places where a pointer to the cpumask_of_cpu value
    will result in reducing stack pressure.

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200

Signed-off-by: Mike Travis <travis@sgi.com>
---
 lib/smp_processor_id.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.tip.orig/lib/smp_processor_id.c
+++ linux-2.6.tip/lib/smp_processor_id.c
@@ -11,7 +11,7 @@ notrace unsigned int debug_smp_processor
 {
 	unsigned long preempt_count = preempt_count();
 	int this_cpu = raw_smp_processor_id();
-	cpumask_t this_mask;
+	cpumask_of_cpu_ptr_declare(this_mask);
 
 	if (likely(preempt_count))
 		goto out;
@@ -23,9 +23,9 @@ notrace unsigned int debug_smp_processor
 	 * Kernel threads bound to a single CPU can safely use
 	 * smp_processor_id():
 	 */
-	this_mask = cpumask_of_cpu(this_cpu);
+	cpumask_of_cpu_ptr_next(this_mask, cpu);
 
-	if (cpus_equal(current->cpus_allowed, this_mask))
+	if (cpus_equal(current->cpus_allowed, *this_mask))
 		goto out;
 
 	/*

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 7/8] cpumask: Provide a generic set of CPUMASK_ALLOC macros
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
                   ` (5 preceding siblings ...)
  2008-07-15 21:14 ` [PATCH 6/8] cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c Mike Travis
@ 2008-07-15 21:14 ` Mike Travis
  2008-07-16  6:41   ` Bert Wesarg
  2008-07-15 21:14 ` [PATCH 8/8] cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target Mike Travis
  2008-07-18 20:04 ` [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Ingo Molnar
  8 siblings, 1 reply; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel,
	Paul Jackson

[-- Attachment #1: add-CPUMASK_ALLOC --]
[-- Type: text/plain, Size: 2043 bytes --]

  * Provide a generic set of CPUMASK_ALLOC macros patterned after the
    SCHED_CPUMASK_ALLOC macros.  This is used where multiple cpumask_t
    variables are declared on the stack to reduce the amount of stack
    space required.

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Paul Jackson <pj@sgi.com>
---
 include/linux/cpumask.h |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

--- linux-2.6.tip.orig/include/linux/cpumask.h
+++ linux-2.6.tip/include/linux/cpumask.h
@@ -75,6 +75,17 @@
  * CPU_MASK_NONE			Initializer - no bits set
  * unsigned long *cpus_addr(mask)	Array of unsigned long's in mask
  *
+ *if NR_CPUS > BITS_PER_LONG
+ *   CPUMASK_ALLOC(m)			Declares and allocates struct m *m =
+ *					   (struct m *)kmalloc(sizeof(*m), ...)
+ *   CPUMASK_FREE(m)			Macro for kfree(v)
+ *else
+ *   CPUMASK_ALLOC(m)			Declares struct m _m, *m = &_m
+ *   CPUMASK_FREE(m)			Nop
+ *endif
+ *   CPUMASK_VAR(v, m)			Declares cpumask_t *v =
+ *						m + offset(struct m, v)
+ *
  * int cpumask_scnprintf(buf, len, mask) Format cpumask for printing
  * int cpumask_parse_user(ubuf, ulen, mask)	Parse ascii string as cpumask
  * int cpulist_scnprintf(buf, len, mask) Format cpumask as list for printing
@@ -311,6 +322,16 @@ extern cpumask_t cpu_mask_all;
 
 #define cpus_addr(src) ((src).bits)
 
+#if NR_CPUS > BITS_PER_LONG
+#define	CPUMASK_ALLOC(m)	struct m *m = kmalloc(sizeof(*m), GFP_KERNEL)
+#define	CPUMASK_FREE(m)		kfree(m)
+#else
+#define	CPUMASK_ALLOC(m)	struct allmasks _m, *m = &_m
+#define	CPUMASK_FREE(m)
+#endif
+#define	CPUMASK_VAR(v, m) 	cpumask_t *v = (cpumask_t *)		\
+				((unsigned long)(m) + offsetof(struct m, v))
+
 #define cpumask_scnprintf(buf, len, src) \
 			__cpumask_scnprintf((buf), (len), &(src), NR_CPUS)
 static inline int __cpumask_scnprintf(char *buf, int len,

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 8/8] cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
                   ` (6 preceding siblings ...)
  2008-07-15 21:14 ` [PATCH 7/8] cpumask: Provide a generic set of CPUMASK_ALLOC macros Mike Travis
@ 2008-07-15 21:14 ` Mike Travis
  2008-07-18 20:04 ` [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Ingo Molnar
  8 siblings, 0 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-15 21:14 UTC (permalink / raw)
  To: Rusty Russell, Ingo Molnar, Andrew Morton
  Cc: H. Peter Anvin, Christoph Lameter, Jack Steiner, linux-kernel,
	Adrian Bunk, Venki Pallipadi, Paul Jackson

[-- Attachment #1: optimize-cpumasks-in-arch_x86_kernel_cpu_cpufreq_speedstep-centrino_c --]
[-- Type: text/plain, Size: 5182 bytes --]

  * Use the CPUMASK_ALLOC macros in the centrino_target() function.

Based on linux-2.6.tip/master at the following commit:

    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
    Merge: 9a635fa... 724dce0...
    Author: Ingo Molnar <mingo@elte.hu>
    Date:   Tue Jul 15 14:55:17 2008 +0200

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Paul Jackson <pj@sgi.com>
---
 arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c |   73 ++++++++++++++---------
 1 file changed, 45 insertions(+), 28 deletions(-)

--- linux-2.6.tip.orig/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
@@ -535,6 +535,13 @@ static int centrino_verify (struct cpufr
  *
  * Sets a new CPUFreq policy.
  */
+struct allmasks {
+	cpumask_t		online_policy_cpus;
+	cpumask_t		saved_mask;
+	cpumask_t		set_mask;
+	cpumask_t		covered_cpus;
+};
+
 static int centrino_target (struct cpufreq_policy *policy,
 			    unsigned int target_freq,
 			    unsigned int relation)
@@ -542,48 +549,55 @@ static int centrino_target (struct cpufr
 	unsigned int    newstate = 0;
 	unsigned int	msr, oldmsr = 0, h = 0, cpu = policy->cpu;
 	struct cpufreq_freqs	freqs;
-	cpumask_t		online_policy_cpus;
-	cpumask_t		saved_mask;
-	cpumask_t		set_mask;
-	cpumask_t		covered_cpus;
 	int			retval = 0;
 	unsigned int		j, k, first_cpu, tmp;
-
-	if (unlikely(centrino_model[cpu] == NULL))
-		return -ENODEV;
+	CPUMASK_ALLOC(allmasks);
+	CPUMASK_VAR(online_policy_cpus, allmasks);
+	CPUMASK_VAR(saved_mask, allmasks);
+	CPUMASK_VAR(set_mask, allmasks);
+	CPUMASK_VAR(covered_cpus, allmasks);
+
+	if (unlikely(allmasks == NULL))
+		return -ENOMEM;
+
+	if (unlikely(centrino_model[cpu] == NULL)) {
+		retval = -ENODEV;
+		goto out;
+	}
 
 	if (unlikely(cpufreq_frequency_table_target(policy,
 			centrino_model[cpu]->op_points,
 			target_freq,
 			relation,
 			&newstate))) {
-		return -EINVAL;
+		retval = -EINVAL;
+		goto out;
 	}
 
 #ifdef CONFIG_HOTPLUG_CPU
 	/* cpufreq holds the hotplug lock, so we are safe from here on */
-	cpus_and(online_policy_cpus, cpu_online_map, policy->cpus);
+	cpus_and(*online_policy_cpus, cpu_online_map, policy->cpus);
 #else
-	online_policy_cpus = policy->cpus;
+	*online_policy_cpus = policy->cpus;
 #endif
 
-	saved_mask = current->cpus_allowed;
+	*saved_mask = current->cpus_allowed;
 	first_cpu = 1;
-	cpus_clear(covered_cpus);
-	for_each_cpu_mask_nr(j, online_policy_cpus) {
+	cpus_clear(*covered_cpus);
+	for_each_cpu_mask_nr(j, *online_policy_cpus) {
 		/*
 		 * Support for SMP systems.
 		 * Make sure we are running on CPU that wants to change freq
 		 */
-		cpus_clear(set_mask);
+		cpus_clear(*set_mask);
 		if (policy->shared_type == CPUFREQ_SHARED_TYPE_ANY)
-			cpus_or(set_mask, set_mask, online_policy_cpus);
+			cpus_or(*set_mask, *set_mask, *online_policy_cpus);
 		else
-			cpu_set(j, set_mask);
+			cpu_set(j, *set_mask);
 
-		set_cpus_allowed_ptr(current, &set_mask);
+		set_cpus_allowed_ptr(current, set_mask);
 		preempt_disable();
-		if (unlikely(!cpu_isset(smp_processor_id(), set_mask))) {
+		if (unlikely(!cpu_isset(smp_processor_id(), *set_mask))) {
 			dprintk("couldn't limit to CPUs in this domain\n");
 			retval = -EAGAIN;
 			if (first_cpu) {
@@ -611,7 +625,7 @@ static int centrino_target (struct cpufr
 			dprintk("target=%dkHz old=%d new=%d msr=%04x\n",
 				target_freq, freqs.old, freqs.new, msr);
 
-			for_each_cpu_mask_nr(k, online_policy_cpus) {
+			for_each_cpu_mask_nr(k, *online_policy_cpus) {
 				freqs.cpu = k;
 				cpufreq_notify_transition(&freqs,
 					CPUFREQ_PRECHANGE);
@@ -630,11 +644,11 @@ static int centrino_target (struct cpufr
 			break;
 		}
 
-		cpu_set(j, covered_cpus);
+		cpu_set(j, *covered_cpus);
 		preempt_enable();
 	}
 
-	for_each_cpu_mask_nr(k, online_policy_cpus) {
+	for_each_cpu_mask_nr(k, *online_policy_cpus) {
 		freqs.cpu = k;
 		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 	}
@@ -647,10 +661,10 @@ static int centrino_target (struct cpufr
 		 * Best effort undo..
 		 */
 
-		if (!cpus_empty(covered_cpus)) {
+		if (!cpus_empty(*covered_cpus)) {
 			cpumask_of_cpu_ptr_declare(new_mask);
 
-			for_each_cpu_mask_nr(j, covered_cpus) {
+			for_each_cpu_mask_nr(j, *covered_cpus) {
 				cpumask_of_cpu_ptr_next(new_mask, j);
 				set_cpus_allowed_ptr(current, new_mask);
 				wrmsr(MSR_IA32_PERF_CTL, oldmsr, h);
@@ -660,19 +674,22 @@ static int centrino_target (struct cpufr
 		tmp = freqs.new;
 		freqs.new = freqs.old;
 		freqs.old = tmp;
-		for_each_cpu_mask_nr(j, online_policy_cpus) {
+		for_each_cpu_mask_nr(j, *online_policy_cpus) {
 			freqs.cpu = j;
 			cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
 			cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 		}
 	}
-	set_cpus_allowed_ptr(current, &saved_mask);
-	return 0;
+	set_cpus_allowed_ptr(current, saved_mask);
+	retval = 0;
+	goto out;
 
 migrate_end:
 	preempt_enable();
-	set_cpus_allowed_ptr(current, &saved_mask);
-	return 0;
+	set_cpus_allowed_ptr(current, saved_mask);
+out:
+	CPUMASK_FREE(allmasks);
+	return retval;
 }
 
 static struct freq_attr* centrino_attr[] = {

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/8] cpumask: Provide a generic set of CPUMASK_ALLOC macros
  2008-07-15 21:14 ` [PATCH 7/8] cpumask: Provide a generic set of CPUMASK_ALLOC macros Mike Travis
@ 2008-07-16  6:41   ` Bert Wesarg
  2008-07-16 12:25     ` Mike Travis
  0 siblings, 1 reply; 26+ messages in thread
From: Bert Wesarg @ 2008-07-16  6:41 UTC (permalink / raw)
  To: Mike Travis
  Cc: Rusty Russell, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Christoph Lameter, Jack Steiner, linux-kernel, Paul Jackson

On Tue, Jul 15, 2008 at 23:14, Mike Travis <travis@sgi.com> wrote:
>  * Provide a generic set of CPUMASK_ALLOC macros patterned after the
>    SCHED_CPUMASK_ALLOC macros.  This is used where multiple cpumask_t
>    variables are declared on the stack to reduce the amount of stack
>    space required.
>
> Based on linux-2.6.tip/master at the following commit:
>
>    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
>    Merge: 9a635fa... 724dce0...
>    Author: Ingo Molnar <mingo@elte.hu>
>    Date:   Tue Jul 15 14:55:17 2008 +0200
>
> Signed-off-by: Mike Travis <travis@sgi.com>
> Cc: Paul Jackson <pj@sgi.com>
> ---
>  include/linux/cpumask.h |   21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> --- linux-2.6.tip.orig/include/linux/cpumask.h
> +++ linux-2.6.tip/include/linux/cpumask.h
> @@ -75,6 +75,17 @@
>  * CPU_MASK_NONE                       Initializer - no bits set
>  * unsigned long *cpus_addr(mask)      Array of unsigned long's in mask
>  *
> + *if NR_CPUS > BITS_PER_LONG
> + *   CPUMASK_ALLOC(m)                  Declares and allocates struct m *m =
> + *                                        (struct m *)kmalloc(sizeof(*m), ...)
Shouldn't you mention the GFP_KERNEL flag? And the cast should not
necessarily be mentioned in a comment.

> + *   CPUMASK_FREE(m)                   Macro for kfree(v)
                                                    kfree(m)

> + *else
> + *   CPUMASK_ALLOC(m)                  Declares struct m _m, *m = &_m
> + *   CPUMASK_FREE(m)                   Nop
> + *endif
> + *   CPUMASK_VAR(v, m)                 Declares cpumask_t *v =
> + *                                             m + offset(struct m, v)
                                                      offsetof
and why can't you use a &(m->v)?

Regards
Bert

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/8] cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c
  2008-07-15 21:14 ` [PATCH 4/8] cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c Mike Travis
@ 2008-07-16 11:17   ` Dean Nelson
  0 siblings, 0 replies; 26+ messages in thread
From: Dean Nelson @ 2008-07-16 11:17 UTC (permalink / raw)
  To: Mike Travis
  Cc: Rusty Russell, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Christoph Lameter, Jack Steiner, linux-kernel

On Tue, Jul 15, 2008 at 02:14:33PM -0700, Mike Travis wrote:
>   * Optimize various places where a pointer to the cpumask_of_cpu value
>     will result in reducing stack pressure.
> 
> Based on linux-2.6.tip/master at the following commit:
> 
>     commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
>     Merge: 9a635fa... 724dce0...
>     Author: Ingo Molnar <mingo@elte.hu>
>     Date:   Tue Jul 15 14:55:17 2008 +0200
> 
> Signed-off-by: Mike Travis <travis@sgi.com>
> Cc: Dean Nelson <dcn@sgi.com>
> ---
>  drivers/misc/sgi-xp/xpc_main.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- linux-2.6.tip.orig/drivers/misc/sgi-xp/xpc_main.c
> +++ linux-2.6.tip/drivers/misc/sgi-xp/xpc_main.c
> @@ -229,10 +229,11 @@ xpc_hb_checker(void *ignore)
>  	int last_IRQ_count = 0;
>  	int new_IRQ_count;
>  	int force_IRQ = 0;
> +	cpumask_of_cpu_ptr(cpumask, XPC_HB_CHECK_CPU);
>  
>  	/* this thread was marked active by xpc_hb_init() */
>  
> -	set_cpus_allowed(current, cpumask_of_cpu(XPC_HB_CHECK_CPU));
> +	set_cpus_allowed_ptr(current, cpumask);
>  
>  	/* set our heartbeating to other partitions into motion */
>  	xpc_hb_check_timeout = jiffies + (xpc_hb_check_interval * HZ);
> 

Acked-by: Dean Nelson <dcn@sgi.com>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/8] cpumask: Provide a generic set of CPUMASK_ALLOC macros
  2008-07-16  6:41   ` Bert Wesarg
@ 2008-07-16 12:25     ` Mike Travis
  2008-07-16 13:01       ` Mike Travis
  0 siblings, 1 reply; 26+ messages in thread
From: Mike Travis @ 2008-07-16 12:25 UTC (permalink / raw)
  To: Bert Wesarg
  Cc: Rusty Russell, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Christoph Lameter, Jack Steiner, linux-kernel, Paul Jackson, LKML

Bert Wesarg wrote:
> On Tue, Jul 15, 2008 at 23:14, Mike Travis <travis@sgi.com> wrote:
>>  * Provide a generic set of CPUMASK_ALLOC macros patterned after the
>>    SCHED_CPUMASK_ALLOC macros.  This is used where multiple cpumask_t
>>    variables are declared on the stack to reduce the amount of stack
>>    space required.
>>
>> Based on linux-2.6.tip/master at the following commit:
>>
>>    commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
>>    Merge: 9a635fa... 724dce0...
>>    Author: Ingo Molnar <mingo@elte.hu>
>>    Date:   Tue Jul 15 14:55:17 2008 +0200
>>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> Cc: Paul Jackson <pj@sgi.com>
>> ---
>>  include/linux/cpumask.h |   21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>>
>> --- linux-2.6.tip.orig/include/linux/cpumask.h
>> +++ linux-2.6.tip/include/linux/cpumask.h
>> @@ -75,6 +75,17 @@
>>  * CPU_MASK_NONE                       Initializer - no bits set
>>  * unsigned long *cpus_addr(mask)      Array of unsigned long's in mask
>>  *
>> + *if NR_CPUS > BITS_PER_LONG
>> + *   CPUMASK_ALLOC(m)                  Declares and allocates struct m *m =
>> + *                                        (struct m *)kmalloc(sizeof(*m), ...)
> Shouldn't you mention the GFP_KERNEL flag? And the cast should not
> necessarily be mentioned in a comment.

Hmm, yes.  I encountered the checkpatch warning about casting kmalloc and
changed the macro but not the comment.  (And now there's room for adding
GFP_KERNEL to the comment! ;-)
> 
>> + *   CPUMASK_FREE(m)                   Macro for kfree(v)
>                                                     kfree(m)
> 
>> + *else
>> + *   CPUMASK_ALLOC(m)                  Declares struct m _m, *m = &_m
>> + *   CPUMASK_FREE(m)                   Nop
>> + *endif
>> + *   CPUMASK_VAR(v, m)                 Declares cpumask_t *v =
>> + *                                             m + offset(struct m, v)
>                                                       offsetof
> and why can't you use a &(m->v)?

I thought about this but what if kmalloc fails?  It's ok to add the
offset to a NULL pointer, but dereferencing a null pointer (even though
it's just to get the address) may introduce a fault, yes?

I'll look into this further.  There was one other remnant from the
copy/paste that needed to be fixed:

-#define        CPUMASK_ALLOC(m)        struct allmasks _m, *m = &_m
+#define        CPUMASK_ALLOC(m)        struct m _m, *m = &_m

Thanks!
Mike

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/8] cpumask: Provide a generic set of CPUMASK_ALLOC macros
  2008-07-16 12:25     ` Mike Travis
@ 2008-07-16 13:01       ` Mike Travis
  0 siblings, 0 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-16 13:01 UTC (permalink / raw)
  To: Bert Wesarg
  Cc: Rusty Russell, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Christoph Lameter, Jack Steiner, linux-kernel, Paul Jackson

Mike Travis wrote:
> Bert Wesarg wrote:
...
>>> + *   CPUMASK_VAR(v, m)                 Declares cpumask_t *v =
>>> + *                                             m + offset(struct m, v)
>>                                                       offsetof
>> and why can't you use a &(m->v)?
> 
> I thought about this but what if kmalloc fails?  It's ok to add the
> offset to a NULL pointer, but dereferencing a null pointer (even though
> it's just to get the address) may introduce a fault, yes?
> 
> I'll look into this further.  

Answering my own question, apparently it is ok.  The pointer is simply used
to provide the base to which the offset is added.  

        struct allmasks *allmasks = ((void *)0);
  400557:       48 c7 45 d8 00 00 00    movq   $0x0,0xffffffffffffffd8(%rbp)
  40055e:       00

        cpumask_t *online_policy_cpus = &(allmasks->online_policy_cpus);
  40055f:       48 8b 45 d8             mov    0xffffffffffffffd8(%rbp),%rax
  400563:       48 89 45 e0             mov    %rax,0xffffffffffffffe0(%rbp)
        cpumask_t *saved_mask = &(allmasks->saved_mask);
  400567:       48 8b 45 d8             mov    0xffffffffffffffd8(%rbp),%rax
  40056b:       48 05 00 02 00 00       add    $0x200,%rax
  400571:       48 89 45 e8             mov    %rax,0xffffffffffffffe8(%rbp)
        cpumask_t *set_mask = &(allmasks->set_mask);
  400575:       48 8b 45 d8             mov    0xffffffffffffffd8(%rbp),%rax
  400579:       48 05 00 04 00 00       add    $0x400,%rax
  40057f:       48 89 45 f0             mov    %rax,0xfffffffffffffff0(%rbp)
        cpumask_t *covered_cpus = &(allmasks->covered_cpus);
  400583:       48 8b 45 d8             mov    0xffffffffffffffd8(%rbp),%rax
  400587:       48 05 00 06 00 00       add    $0x600,%rax
  40058d:       48 89 45 f8             mov    %rax,0xfffffffffffffff8(%rbp)

Sometimes the most obvious is also the most elusive... ;-)

I've updated the code and the patch description to clarify the checking
of a NULL structure base before using the cpumask_t pointers.  I've also
changed CPUMASK_VAR to CPUMASK_PTR to be a bit more clear on it's function.

  * Provide a generic set of CPUMASK_ALLOC macros patterned after the
    SCHED_CPUMASK_ALLOC macros.  This is used where multiple cpumask_t
    variables are declared on the stack to reduce the amount of stack
    space required when the NR_CPUS count is large enough to warrant it.

    Basically, if NR_CPUS <= BITS_PER_LONG then the multiple cpumask_t
    structure (which needs to be pre-defined) is declared as a local
    variable and pointers to each mask is provided.  The compiler
    will optimize out the extra dereference, resulting in code that
    is the same without the pointer reference.

    If NR_CPUS > BITS_PER_LONG, then instead of declaring the combined
    cpumask_t structure on the stack, kmalloc is used to obtain the
    memory space.  In this case, the CPUMASK_FREE is now kfree instead
    of a nop.

    For both cases, CPUMASK_PTR declares and initializes each cpumask_t
    pointer but these should *not* be used before the structure pointer
    is verified not to be NULL.  (This check for NULL will be optimized
    out for the case where the structure is declared as local memory.)

One question that remains, should the threshold to use kmalloc be
BITS_PER_LONG or something larger?  Sched uses NR_CPUS > 128, though
it has about 7 cpumask_t vars it uses.  My (obvious) concern is when
NR_CPUS is 4096 (and soon 16384), but where is the line between a
fairly large system and a really huge system?

Thanks,
Mike

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-15 21:14 ` [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Mike Travis
@ 2008-07-18  5:30   ` Rusty Russell
  2008-07-18 13:43     ` Mike Travis
  0 siblings, 1 reply; 26+ messages in thread
From: Rusty Russell @ 2008-07-18  5:30 UTC (permalink / raw)
  To: Mike Travis
  Cc: Ingo Molnar, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Len Brown, Dave Jones, Paul Jackson,
	Tigran Aivazian, Robert Richter, Greg Banks, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner, Andreas Schwab, Johannes Weiner

On Wednesday 16 July 2008 07:14:30 Mike Travis wrote:
>   * This patch replaces the dangerous lvalue version of cpumask_of_cpu
>     with new cpumask_of_cpu_ptr macros.  These are patterned after the
>     node_to_cpumask_ptr macros.

Hi Mike,

   Should we just put cpumask_of_cpu_map[] in generic code and then have 
cpumask_of_cpu() always return a cpumask_t pointer?  These macros which 
declare things which may be one of two types is a real penalty for code 
readability.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-18  5:30   ` Rusty Russell
@ 2008-07-18 13:43     ` Mike Travis
  2008-07-20 10:03       ` Rusty Russell
  0 siblings, 1 reply; 26+ messages in thread
From: Mike Travis @ 2008-07-18 13:43 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Ingo Molnar, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Len Brown, Dave Jones, Paul Jackson,
	Tigran Aivazian, Robert Richter, Greg Banks, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner, Andreas Schwab, Johannes Weiner

Rusty Russell wrote:
> On Wednesday 16 July 2008 07:14:30 Mike Travis wrote:
>>   * This patch replaces the dangerous lvalue version of cpumask_of_cpu
>>     with new cpumask_of_cpu_ptr macros.  These are patterned after the
>>     node_to_cpumask_ptr macros.
> 
> Hi Mike,
> 
>    Should we just put cpumask_of_cpu_map[] in generic code and then have 
> cpumask_of_cpu() always return a cpumask_t pointer?  These macros which 
> declare things which may be one of two types is a real penalty for code 
> readability.
> 
> Thanks,
> Rusty.

Hi,

I wouldn't mind it at all, and since it's almost always calling a function
that requires a cpumask_t pointer (like the cpu_* ops or set_cpus_allowed_ptr)
then there shouldn't be too many "pointer dereference" penalties.  I'm just
always a bit hesitant to make too many generic changes since I have only x86
and ia64 machines to test with.

But there's a few of these new "fake" pointer macros (well, at least two... ;-),
so we'll either need more of these types of macros, or we have to consider using
pointers for almost all cpumask_t args.  The next jump to 16k cpus will use
2k bytes of stack space for each cpumask_t arg, instead of the current "measly"
512 bytes.

Another thought I had is perhaps cpumask.h should define something that indicates
a "huge NR_CPUS count" that is used globally to trigger things like kmalloc of
cpumask variables, instead of declaring them on the stack...?  Or (as has been
discussed in the past), maybe a new cpumask_t type will be needed?

Thanks,
Mike


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations
  2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
                   ` (7 preceding siblings ...)
  2008-07-15 21:14 ` [PATCH 8/8] cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target Mike Travis
@ 2008-07-18 20:04 ` Ingo Molnar
  8 siblings, 0 replies; 26+ messages in thread
From: Ingo Molnar @ 2008-07-18 20:04 UTC (permalink / raw)
  To: Mike Travis
  Cc: Rusty Russell, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel


* Mike Travis <travis@sgi.com> wrote:

>   * Patch 1 replaces the dangerous lvalue version of cpumask_of_cpu
>     with new cpumask_of_cpu_ptr macros.  These are patterned after the
>     node_to_cpumask_ptr macros.
> 
>   * Patches 2..6 optimizes various places where a pointer to the cpumask_of_cpu
>     value will result in reducing stack pressure.
> 
>   * Patch 7 provides a generic set of CPUMASK_ALLOC macros patterned after
>     the SCHED_CPUMASK_ALLOC macros.  This is used where multiple cpumask_t
>     variables are declared on the stack to reduce the amount of stack space
>     required.
> 
>   * Patch 8 uses the CPUMASK_ALLOC macros in the centrino_target() function.

applied to tip/cpus4096 for more testing, thanks Mike.

	Ingo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
  2008-07-15 21:14 ` [PATCH 6/8] cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c Mike Travis
@ 2008-07-18 20:33   ` Ingo Molnar
  2008-07-18 20:35     ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Ingo Molnar @ 2008-07-18 20:33 UTC (permalink / raw)
  To: Mike Travis
  Cc: Rusty Russell, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel


hmmm:

> --- linux-2.6.tip.orig/lib/smp_processor_id.c
> +++ linux-2.6.tip/lib/smp_processor_id.c

>  	unsigned long preempt_count = preempt_count();
>  	int this_cpu = raw_smp_processor_id();
> -	cpumask_t this_mask;
> +	cpumask_of_cpu_ptr_declare(this_mask);

> -	this_mask = cpumask_of_cpu(this_cpu);
> +	cpumask_of_cpu_ptr_next(this_mask, cpu);

'cpu' is not declared ...

	Ingo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
  2008-07-18 20:33   ` Ingo Molnar
@ 2008-07-18 20:35     ` Ingo Molnar
  2008-07-18 22:36       ` Mike Travis
  0 siblings, 1 reply; 26+ messages in thread
From: Ingo Molnar @ 2008-07-18 20:35 UTC (permalink / raw)
  To: Mike Travis
  Cc: Rusty Russell, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel


* Ingo Molnar <mingo@elte.hu> wrote:

> hmmm:
> 
> > --- linux-2.6.tip.orig/lib/smp_processor_id.c
> > +++ linux-2.6.tip/lib/smp_processor_id.c
> 
> >  	unsigned long preempt_count = preempt_count();
> >  	int this_cpu = raw_smp_processor_id();
> > -	cpumask_t this_mask;
> > +	cpumask_of_cpu_ptr_declare(this_mask);
> 
> > -	this_mask = cpumask_of_cpu(this_cpu);
> > +	cpumask_of_cpu_ptr_next(this_mask, cpu);
> 
> 'cpu' is not declared ...

commit 06f8d00e9eecb738c99b737ac38a585ea7583ad5
Author: Ingo Molnar <mingo@elte.hu>
Date:   Fri Jul 18 22:34:00 2008 +0200

    cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix
    
    fix typo.
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 lib/smp_processor_id.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c
index 9fb2df0..c4381d9 100644
--- a/lib/smp_processor_id.c
+++ b/lib/smp_processor_id.c
@@ -23,7 +23,7 @@ notrace unsigned int debug_smp_processor_id(void)
 	 * Kernel threads bound to a single CPU can safely use
 	 * smp_processor_id():
 	 */
-	cpumask_of_cpu_ptr_next(this_mask, cpu);
+	cpumask_of_cpu_ptr_next(this_mask, this_cpu);
 
 	if (cpus_equal(current->cpus_allowed, *this_mask))
 		goto out;

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
  2008-07-18 20:35     ` Ingo Molnar
@ 2008-07-18 22:36       ` Mike Travis
  0 siblings, 0 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-18 22:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rusty Russell, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel

Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
>> hmmm:
>>
>>> --- linux-2.6.tip.orig/lib/smp_processor_id.c
>>> +++ linux-2.6.tip/lib/smp_processor_id.c
>>>  	unsigned long preempt_count = preempt_count();
>>>  	int this_cpu = raw_smp_processor_id();
>>> -	cpumask_t this_mask;
>>> +	cpumask_of_cpu_ptr_declare(this_mask);
>>> -	this_mask = cpumask_of_cpu(this_cpu);
>>> +	cpumask_of_cpu_ptr_next(this_mask, cpu);
>> 'cpu' is not declared ...

Hmm is right...!  ;-)

Went through my logs and didn't see the file explicitly listed as
compiled but it still should have caught it.

Thanks for the fix!

> 
> commit 06f8d00e9eecb738c99b737ac38a585ea7583ad5
> Author: Ingo Molnar <mingo@elte.hu>
> Date:   Fri Jul 18 22:34:00 2008 +0200
> 
>     cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix
>     
>     fix typo.
>     
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  lib/smp_processor_id.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c
> index 9fb2df0..c4381d9 100644
> --- a/lib/smp_processor_id.c
> +++ b/lib/smp_processor_id.c
> @@ -23,7 +23,7 @@ notrace unsigned int debug_smp_processor_id(void)
>  	 * Kernel threads bound to a single CPU can safely use
>  	 * smp_processor_id():
>  	 */
> -	cpumask_of_cpu_ptr_next(this_mask, cpu);
> +	cpumask_of_cpu_ptr_next(this_mask, this_cpu);
>  
>  	if (cpus_equal(current->cpus_allowed, *this_mask))
>  		goto out;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-18 13:43     ` Mike Travis
@ 2008-07-20 10:03       ` Rusty Russell
  2008-07-23 11:20         ` Ingo Molnar
  2008-07-23 17:45         ` Mike Travis
  0 siblings, 2 replies; 26+ messages in thread
From: Rusty Russell @ 2008-07-20 10:03 UTC (permalink / raw)
  To: Mike Travis
  Cc: Ingo Molnar, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Len Brown, Dave Jones, Paul Jackson,
	Tigran Aivazian, Robert Richter, Greg Banks, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner, Andreas Schwab, Johannes Weiner

On Friday 18 July 2008 23:43:07 Mike Travis wrote:
> Rusty Russell wrote:
> > On Wednesday 16 July 2008 07:14:30 Mike Travis wrote:
> >>   * This patch replaces the dangerous lvalue version of cpumask_of_cpu
> >>     with new cpumask_of_cpu_ptr macros.  These are patterned after the
> >>     node_to_cpumask_ptr macros.
> >
> > Hi Mike,
> >
> >    Should we just put cpumask_of_cpu_map[] in generic code and then have
> > cpumask_of_cpu() always return a cpumask_t pointer?  These macros which
> > declare things which may be one of two types is a real penalty for code
> > readability.
> >
> > Thanks,
> > Rusty.
>
> Hi,
>
> I wouldn't mind it at all, and since it's almost always calling a function
> that requires a cpumask_t pointer (like the cpu_* ops or
> set_cpus_allowed_ptr) then there shouldn't be too many "pointer
> dereference" penalties.  I'm just always a bit hesitant to make too many
> generic changes since I have only x86 and ia64 machines to test with.

The simple version is just a static array of [NR_CPUS] cpumask_t's.  Do that, 
with an override for smarter archs?

I really REALLY prefer that over the fairly tortuous macros.

> Another thought I had is perhaps cpumask.h should define something that
> indicates a "huge NR_CPUS count" that is used globally to trigger things
> like kmalloc of cpumask variables, instead of declaring them on the
> stack...?  Or (as has been discussed in the past), maybe a new cpumask_t
> type will be needed?

AFAICT the final answer has to be a get_cpu_mask()/put_cpu_mask(), which 
sleeps and doesn't nest (so we can use a pool allocator).  Of course, that 
kind of analysis is non-trivial, so I suggest that's not for this merge 
window...

Want me to try something and see if it boots?
Rusty.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-20 10:03       ` Rusty Russell
@ 2008-07-23 11:20         ` Ingo Molnar
  2008-07-23 11:23           ` Ingo Molnar
  2008-07-23 14:16           ` Mike Travis
  2008-07-23 17:45         ` Mike Travis
  1 sibling, 2 replies; 26+ messages in thread
From: Ingo Molnar @ 2008-07-23 11:20 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Mike Travis, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Len Brown, Dave Jones, Paul Jackson,
	Tigran Aivazian, Robert Richter, Greg Banks, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner, Andreas Schwab, Johannes Weiner


* Rusty Russell <rusty@rustcorp.com.au> wrote:

> > I wouldn't mind it at all, and since it's almost always calling a 
> > function that requires a cpumask_t pointer (like the cpu_* ops or 
> > set_cpus_allowed_ptr) then there shouldn't be too many "pointer 
> > dereference" penalties.  I'm just always a bit hesitant to make too 
> > many generic changes since I have only x86 and ia64 machines to test 
> > with.
> 
> The simple version is just a static array of [NR_CPUS] cpumask_t's.  
> Do that, with an override for smarter archs?
> 
> I really REALLY prefer that over the fairly tortuous macros.

a fresh commit in -git has exposed the topology.h mess - see the hack 
below. We now have diverging versions of topology_core_siblings() 
semantics - that sure cannot be right. Mike?

	Ingo

------->
commit 695a6b456307455a10059512208e8ed0d376ecd3
Author: Ingo Molnar <mingo@elte.hu>
Date:   Wed Jul 23 13:19:44 2008 +0200

    topology: work around topology_core_siblings() breakage
    
    work around:
    
    drivers/net/sfc/efx.c: In function ‘efx_probe_interrupts':
    drivers/net/sfc/efx.c:845: error: lvalue required as unary ‘&' operand
    
    the topology API is a mess right now ...
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 drivers/net/sfc/efx.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 45c72ee..1ababfa 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -842,8 +842,10 @@ static void efx_probe_interrupts(struct efx_nic *efx)
 			for_each_online_cpu(cpu) {
 				if (!cpu_isset(cpu, core_mask)) {
 					++efx->rss_queues;
+#if 0
 					cpus_or(core_mask, core_mask,
 						topology_core_siblings(cpu));
+#endif
 				}
 			}
 		} else {


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-23 11:20         ` Ingo Molnar
@ 2008-07-23 11:23           ` Ingo Molnar
  2008-07-23 14:26             ` Mike Travis
  2008-07-23 14:16           ` Mike Travis
  1 sibling, 1 reply; 26+ messages in thread
From: Ingo Molnar @ 2008-07-23 11:23 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Mike Travis, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Len Brown, Dave Jones, Paul Jackson,
	Tigran Aivazian, Robert Richter, Greg Banks, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner, Andreas Schwab, Johannes Weiner


* Ingo Molnar <mingo@elte.hu> wrote:

> > The simple version is just a static array of [NR_CPUS] cpumask_t's.  
> > Do that, with an override for smarter archs?
> > 
> > I really REALLY prefer that over the fairly tortuous macros.
> 
> a fresh commit in -git has exposed the topology.h mess - see the hack 
> below. We now have diverging versions of topology_core_siblings() 
> semantics - that sure cannot be right. Mike?

also i had to do the net/sunrpc/svc.c fixup below.

	Ingo

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 5a32cb7..835d274 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -310,7 +310,8 @@ svc_pool_map_set_cpumask(struct task_struct *task, unsigned int pidx)
 	switch (m->mode) {
 	case SVC_POOL_PERCPU:
 	{
-		set_cpus_allowed_ptr(task, &cpumask_of_cpu(node));
+		cpumask_of_cpu_ptr(cpumask, node);
+		set_cpus_allowed_ptr(task, cpumask);
 		break;
 	}
 	case SVC_POOL_PERNODE:

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-23 11:20         ` Ingo Molnar
  2008-07-23 11:23           ` Ingo Molnar
@ 2008-07-23 14:16           ` Mike Travis
  1 sibling, 0 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-23 14:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rusty Russell, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Len Brown, Dave Jones, Paul Jackson,
	Tigran Aivazian, Robert Richter, Greg Banks, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner, Andreas Schwab, Johannes Weiner

Ingo Molnar wrote:
> * Rusty Russell <rusty@rustcorp.com.au> wrote:
> 
>>> I wouldn't mind it at all, and since it's almost always calling a 
>>> function that requires a cpumask_t pointer (like the cpu_* ops or 
>>> set_cpus_allowed_ptr) then there shouldn't be too many "pointer 
>>> dereference" penalties.  I'm just always a bit hesitant to make too 
>>> many generic changes since I have only x86 and ia64 machines to test 
>>> with.
>> The simple version is just a static array of [NR_CPUS] cpumask_t's.  
>> Do that, with an override for smarter archs?
>>
>> I really REALLY prefer that over the fairly tortuous macros.
> 
> a fresh commit in -git has exposed the topology.h mess - see the hack 
> below. We now have diverging versions of topology_core_siblings() 
> semantics - that sure cannot be right. Mike?
> 
> 	Ingo
> 
> ------->
> commit 695a6b456307455a10059512208e8ed0d376ecd3
> Author: Ingo Molnar <mingo@elte.hu>
> Date:   Wed Jul 23 13:19:44 2008 +0200
> 
>     topology: work around topology_core_siblings() breakage
>     
>     work around:
>     
>     drivers/net/sfc/efx.c: In function ‘efx_probe_interrupts':
>     drivers/net/sfc/efx.c:845: error: lvalue required as unary ‘&' operand
>     
>     the topology API is a mess right now ...
>     
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  drivers/net/sfc/efx.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
> index 45c72ee..1ababfa 100644
> --- a/drivers/net/sfc/efx.c
> +++ b/drivers/net/sfc/efx.c
> @@ -842,8 +842,10 @@ static void efx_probe_interrupts(struct efx_nic *efx)
>  			for_each_online_cpu(cpu) {
>  				if (!cpu_isset(cpu, core_mask)) {
>  					++efx->rss_queues;
> +#if 0
>  					cpus_or(core_mask, core_mask,
>  						topology_core_siblings(cpu));
> +#endif
>  				}
>  			}
>  		} else {

Ahh, yes, I see it now.  If you don't define topology_core_siblings then you get:

#ifndef topology_core_siblings
#define topology_core_siblings(cpu)             cpumask_of_cpu(cpu)
#endif

And of course this is no longer an lvalue.

Rusty - if you don't think there'll be objections from other arches I can put in
a generic cpumask_of_cpu_map[].

Thanks,
Mike




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-23 11:23           ` Ingo Molnar
@ 2008-07-23 14:26             ` Mike Travis
  2008-07-24  0:46               ` Rusty Russell
  0 siblings, 1 reply; 26+ messages in thread
From: Mike Travis @ 2008-07-23 14:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rusty Russell, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Len Brown, Dave Jones, Paul Jackson,
	Tigran Aivazian, Robert Richter, Greg Banks, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner, Andreas Schwab, Johannes Weiner

Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
>>> The simple version is just a static array of [NR_CPUS] cpumask_t's.  
>>> Do that, with an override for smarter archs?
>>>
>>> I really REALLY prefer that over the fairly tortuous macros.
>> a fresh commit in -git has exposed the topology.h mess - see the hack 
>> below. We now have diverging versions of topology_core_siblings() 
>> semantics - that sure cannot be right. Mike?
> 
> also i had to do the net/sunrpc/svc.c fixup below.
> 
> 	Ingo
> 
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index 5a32cb7..835d274 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -310,7 +310,8 @@ svc_pool_map_set_cpumask(struct task_struct *task, unsigned int pidx)
>  	switch (m->mode) {
>  	case SVC_POOL_PERCPU:
>  	{
> -		set_cpus_allowed_ptr(task, &cpumask_of_cpu(node));
> +		cpumask_of_cpu_ptr(cpumask, node);
> +		set_cpus_allowed_ptr(task, cpumask);
>  		break;
>  	}
>  	case SVC_POOL_PERNODE:


I had sent in a patch that has this change.

Standby, the "generic" cpumask_of_cpu_map patch is coming... just testing now.

THanks,
Mike

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-20 10:03       ` Rusty Russell
  2008-07-23 11:20         ` Ingo Molnar
@ 2008-07-23 17:45         ` Mike Travis
  1 sibling, 0 replies; 26+ messages in thread
From: Mike Travis @ 2008-07-23 17:45 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Ingo Molnar, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Paul Jackson, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner

Rusty Russell wrote:
...
>> Another thought I had is perhaps cpumask.h should define something that
>> indicates a "huge NR_CPUS count" that is used globally to trigger things
>> like kmalloc of cpumask variables, instead of declaring them on the
>> stack...?  Or (as has been discussed in the past), maybe a new cpumask_t
>> type will be needed?
> 
> AFAICT the final answer has to be a get_cpu_mask()/put_cpu_mask(), which 
> sleeps and doesn't nest (so we can use a pool allocator).  Of course, that 
> kind of analysis is non-trivial, so I suggest that's not for this merge 
> window...
> 
> Want me to try something and see if it boots?
> Rusty.

Hi Rusty,

There are a number of occasions where a function declares a temporary cpumask_t
variable on the stack to hold (say) current->cpus_allowed.  I tried a couple of
options early on to a.) reserve one or two cpumask_t variables in the task
struct; and b.) reserve one or two cpumask_t variables per cpu.  Both had weird
consequences in some usages and since 4096 is *only* 512 bytes, it didn't seem
worth the effort.  Our next iteration will have NR_CPUS=16384 and therefore
removing all stack declared cpumask_t variables is highly desirable.

Your idea of a pool allocator is very interesting... ;-)

Thanks,
Mike

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  2008-07-23 14:26             ` Mike Travis
@ 2008-07-24  0:46               ` Rusty Russell
  0 siblings, 0 replies; 26+ messages in thread
From: Rusty Russell @ 2008-07-24  0:46 UTC (permalink / raw)
  To: Mike Travis
  Cc: Ingo Molnar, Andrew Morton, H. Peter Anvin, Christoph Lameter,
	Jack Steiner, linux-kernel, Len Brown, Dave Jones, Paul Jackson,
	Tigran Aivazian, Robert Richter, Greg Banks, Eric W. Biederman,
	Adrian Bunk, Thomas Gleixner, Andreas Schwab, Johannes Weiner

On Thursday 24 July 2008 00:26:40 Mike Travis wrote:
> Standby, the "generic" cpumask_of_cpu_map patch is coming... just testing
> now.

Thanks Mike, that's going to be a huge improvement in clarity.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2008-07-24  3:45 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-15 21:14 [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Mike Travis
2008-07-15 21:14 ` [PATCH 1/8] cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Mike Travis
2008-07-18  5:30   ` Rusty Russell
2008-07-18 13:43     ` Mike Travis
2008-07-20 10:03       ` Rusty Russell
2008-07-23 11:20         ` Ingo Molnar
2008-07-23 11:23           ` Ingo Molnar
2008-07-23 14:26             ` Mike Travis
2008-07-24  0:46               ` Rusty Russell
2008-07-23 14:16           ` Mike Travis
2008-07-23 17:45         ` Mike Travis
2008-07-15 21:14 ` [PATCH 2/8] cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c Mike Travis
2008-07-15 21:14 ` [PATCH 3/8] cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c Mike Travis
2008-07-15 21:14 ` [PATCH 4/8] cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c Mike Travis
2008-07-16 11:17   ` Dean Nelson
2008-07-15 21:14 ` [PATCH 5/8] cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c Mike Travis
2008-07-15 21:14 ` [PATCH 6/8] cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c Mike Travis
2008-07-18 20:33   ` Ingo Molnar
2008-07-18 20:35     ` Ingo Molnar
2008-07-18 22:36       ` Mike Travis
2008-07-15 21:14 ` [PATCH 7/8] cpumask: Provide a generic set of CPUMASK_ALLOC macros Mike Travis
2008-07-16  6:41   ` Bert Wesarg
2008-07-16 12:25     ` Mike Travis
2008-07-16 13:01       ` Mike Travis
2008-07-15 21:14 ` [PATCH 8/8] cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target Mike Travis
2008-07-18 20:04 ` [PATCH 0/8] cpumask: Replace/optimize cpumask_of_cpu & cpumask_t operations Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).