* [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v3
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
Modify usage of cpumask_t variables to use pointers as much as possible.
Changes are:
* Use an allocated array of cpumask_t's for cpumask_of_cpu().
This removes > 20,000 bytes of stack usage (see the full breakdown
in the charts below) and reduces the code generated for each usage.
* Use set_cpus_allowed_ptr() to pass a pointer to the "newly allowed"
cpumask (the calling convention is sketched after this list).
This removes > 10,000 bytes of stack usage.
* Use node_to_cpumask_ptr(), which returns a pointer to the cpumask
for the specified node. This removes > 10,000 bytes of stack usage.
* Modify build_sched_domains and related sub-functions to pass
pointers to cpumask temp variables. This consolidates stack
space that was spread over various functions.
* Remove large array from numa_initmem_init() [> 8,000 bytes].
* Optimize usages of {CPU,NODE}_MASK_{NONE,ALL} [> 9,000 bytes].
* Various other changes to reduce stack size and silence checkpatch
warnings [> 7,000 bytes].
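For illustration, the underlying arithmetic (a sketch, not code from the
series): with NR_CPUS=4096 a cpumask_t is 512 bytes, so every function
that receives or builds one by value pays for it in its stack frame,
while the pointer-based variants pay only for a pointer.

    #define NR_CPUS 4096
    #define BITS_PER_LONG (8 * (int)sizeof(unsigned long))

    typedef struct {
            unsigned long bits[(NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG];
    } cpumask_t;                    /* 512 bytes when NR_CPUS=4096 on 64-bit */

    struct task_struct;             /* stand-in declaration for the sketch */

    /* Old convention: the full 512-byte mask is copied for the call. */
    int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask);

    /* New convention: only a pointer crosses the call. */
    int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask);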
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Cc: Cliff Wickman <cpw@sgi.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Banks <gnb@melbourne.sgi.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Len Brown <len.brown@intel.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: William L. Irwin <wli@holomorphy.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
v3: rebased on x86/latest + sched-devel/latest
collapsed many patches so the same files don't appear in different
patches (kernel/sched.c unfortunately still appears in about 5
or 6 patches, and include/asm-x86/topology.h is in 2.)
v2: resubmitted based on x86/latest.
--- ---------------------------------------------------------
* Memory Usage Changes
Summary of the memory usage changes across the patch list, using the
akpm2 config file with NR_CPUS=4096 and MAX_NUMNODES=512.
====== Data (-l 500)
... files 13 vars 1855 all 0 lim 500 unch 0
1 - initial
2 - cpumask_of_cpu
3 - add-CPUMASK_ALL_PTR
5 - nr_cpus-in-kernel_sched
7 - generic-set_cpus_allowed
12 - kernel_sched_c
13 - use-CPUMASK_ALL_PTR
.1. .2. .3. .5. .7. .12. .13. ..final..
32768 . . -32768 . . . . -100% sched_group_nodes_bycpu(.bss)
32768 . . -32768 . . . . -100% init_sched_rt_entity_p(.bss)
32768 . . -32768 . . . . -100% init_rt_rq_p(.bss)
3550 . . +9 . -842 . 2717 -23% build_sched_domains(.text)
674 -95 . . -579 . . . -100% acpi_processor_set_throttling(.text)
533 -533 . . . . . . -100% hpet_enable(.init.text)
512 . . . . . -512 . -100% C(.rodata)
0 . +512 . . . . 512 . cpu_mask_all(.data.read_mostly)
103573 -628 +512 -98295 -579 -842 -512 3229 -96% Totals
====== Sections (-l 500)
... files 13 vars 37 all 0 lim 500 unch 0
1 - initial
2 - cpumask_of_cpu
3 - add-CPUMASK_ALL_PTR
5 - nr_cpus-in-kernel_sched
6 - x86-set_cpus_allowed
7 - generic-set_cpus_allowed
8 - cpuset_cpus_allowed
9 - cpumask_affinity
10 - numa_initmem_init
11 - node_to_cpumask_ptr
12 - kernel_sched_c
13 - use-CPUMASK_ALL_PTR
.1. .2. .3. .5. .6. .7. .8. .9. .10. .11. .12. .13. ..final..
75833950 -13010 +260 -98051 -2972 -2295 +644 +122 +8174 -465 -698 +1349 75727008 <1% Total
42810838 -7428 +39 +225 -757 -431 +215 +182 -1 -366 +945 +591 42804052 <1% .debug_info
6808830 -404 -211 -116 -316 -243 +42 +113 -22 +95 +1950 +14 6809732 <1% .debug_loc
4805176 +16 +512 . . . . . . . . . 4805704 <1% .data.read_mostly
3475017 -528 -16 +48 -128 -496 -256 -112 . -496 -32 -80 3472921 <1% .text
2720422 -895 -16 +26 -192 -146 -23 -4 -1 -164 +26 +81 2719114 <1% .debug_line
1775040 . . -98176 . . . . . . . . 1676864 -5% .bss
1395188 -245 -9 . -131 -95 -19 -17 . . +14 +103 1394789 <1% .debug_abbrev
1141392 -3104 -64 . -1408 -848 +752 -16 . +480 +208 +640 1138032 <1% .debug_ranges
1021159 +32 -512 . . -1152 -640 . . . -1377 -512 1016998 <1% .rodata
982688 . . . . . . . +8208 . . . 990896 <1% .init.data
8080 -80 +464 . . +1152 +640 . . . -2720 +512 8048 <1% __param
142777780 -25646 +447 -196044 -5904 -4554 +1355 +268 +16358 -916 -1684 +2698 142564158 +0% Totals
====== Text/Data ()
... files 13 vars 6 all 0 lim 0 unch 0
1 - initial
2 - cpumask_of_cpu
5 - nr_cpus-in-kernel_sched
7 - generic-set_cpus_allowed
10 - numa_initmem_init
12 - kernel_sched_c
.1. .2. .5. .7. .10. .12. ..final..
3475456 -2048 . . . . 3473408 <1% TextSize
1738752 +2048 . -4096 . -4096 1732608 <1% DataSize
1775616 . -98304 . . . 1677312 -5% BssSize
1220608 . . . +8192 . 1228800 <1% InitSize
10399744 . . -4096 . +4096 10399744 . OtherSize
18610176 . -98304 -8192 +8192 . 18511872 +0% Totals
====== PerCPU ()
... files 13 vars 10 all 0 lim 0 unch 0
1 - initial
.1. ..final..
0 . +0% Totals
====== Stack (-l 500)
... files 13 vars 166 all 0 lim 500 unch 0
1 - initial
2 - cpumask_of_cpu
3 - add-CPUMASK_ALL_PTR
5 - nr_cpus-in-kernel_sched
6 - x86-set_cpus_allowed
7 - generic-set_cpus_allowed
8 - cpuset_cpus_allowed
9 - cpumask_affinity
10 - numa_initmem_init
11 - node_to_cpumask_ptr
12 - kernel_sched_c
13 - use-CPUMASK_ALL_PTR
.1. .2. .3. .5. .6. .7. .8. .9. .10. .11. .12. .13. ..final..
11080 . . . . . . . . -512 -8336 . 2232 -79% build_sched_domains
8248 . . . . . . . -8248 . . . . -100% numa_initmem_init
4648 . . -3840 . . . . . . -808 . . -100% cpu_attach_domain
3176 . -16 +16 . . . . . -1552 -1024 . 600 -81% sched_domain_node_span
3176 -512 . . -512 . . . . . . . 2152 -32% centrino_target
2584 -1024 . . . -512 . . . . . . 1048 -59% acpi_processor_set_throttling
2104 . . . . -1024 . . . . . . 1080 -48% _cpu_down
2088 -1024 . . -512 . . . . . . . 552 -73% powernowk8_cpu_init
2072 -512 . . . . . . . . . . 1560 -24% tick_notify
2056 . . . . . . -2056 . . . . . -100% affinity_set
1784 -1024 . . . . . . . . . . 760 -57% cpufreq_add_dev
1704 . . . . . . . . -1704 . . . -100% kswapd
1608 -512 . . -512 . . . . . . . 584 -63% powernowk8_target
1608 -1608 . . . . . . . . . . . -100% disable_smp
1608 -512 . . -512 . . . . . . . 584 -63% cache_add_dev
1592 . . . . . . . . . -1592 . . -100% do_tune_cpucache
1576 . . . . . . . . . -1576 . . -100% init_sched_build_groups
1560 . . . . -1040 . . . . . . 520 -66% pci_device_probe
1560 -512 . . -512 . . . . . . . 536 -65% check_supported_cpu
1544 . . . . . -512 +512 . . -512 . 1032 -33% sched_setaffinity
1544 -512 . . -520 . . . . . . . 512 -66% powernowk8_get
1544 -1008 . . . . . . . . . . 536 -65% alloc_ldt
1536 -504 . . . . . . . . . . 1032 -32% smp_call_function_single
1536 -1024 . . . . . . . . . . 512 -66% native_smp_send_reschedule
1536 -512 . . -504 . . . . . . . 520 -66% get_cur_freq
1536 -512 . . . -512 . . . . . . 512 -66% acpi_processor_get_throttling
1536 -512 . . -504 . . . . . . . 520 -66% acpi_processor_ffh_cstate_probe
1176 . . . . . . . . . -512 . 664 -43% thread_return
1176 . . . . . . . . . -512 . 664 -43% schedule
1160 . . . . . . . . . -512 . 648 -44% run_rebalance_domains
1160 . . . . . . . . -1160 . . . -100% __build_all_zonelists
1152 . . . . . -504 . . . . . 648 -43% cpuset_attach
1072 . . . . . -512 . . . . . 560 -47% pdflush
1064 . . . . . . . . -1064 . . . -100% cpuup_canceled
1064 . . . . . . . . . -1064 . . -100% cpuup_callback
1032 -1032 . . . . . . . . . . . -100% uv_target_cpus
1032 . . . . -520 . . . . . . 512 -50% system_kthread_notifier
1032 -1032 . . . . . . . . . . . -100% setup_pit_timer
1032 . . . . . . . . . -512 . 520 -49% sched_init_smp
1032 . . . . . . . . . . -520 512 -50% physflat_vector_allocation_domain
1032 . -1032 . . . . . . . . . . -100% kernel_init
1032 -1032 . . . . . . . . . . . -100% init_workqueues
1032 -1032 . . . . . . . . . . . -100% init_idle
1032 . . . . . . . . . . -512 520 -49% destroy_irq
1032 . . . . -1032 . . . . . . . -100% ____call_usermodehelper
1024 . . . . . . -512 . . . . 512 -50% sys_sched_setaffinity
1024 -504 . . . -520 . . . . . . . -100% stopmachine
1024 -1024 . . . . . . . . . . . -100% setup_APIC_timer
1024 -1024 . . . . . . . . . . . -100% native_smp_prepare_cpus
1024 -504 . . -520 . . . . . . . . -100% native_machine_shutdown
1024 . . . . -1024 . . . . . . . -100% kthreadd
1024 -1024 . . . . . . . . . . . -100% kthread_bind
1024 -1024 . . . . . . . . . . . -100% hpet_enable
1024 . . . . . . -512 . . . . 512 -50% compat_sys_sched_setaffinity
1024 . . . . . . . . . . -512 512 -50% __percpu_populate_mask
576 . . . . . -576 . . . . . . -100% cpuset_init
576 . . . . . -576 . . . . . . -100% cpuset_create
552 . . . . . . . . . -552 . . -100% migration_call
520 . . . . . . . . -520 . . . -100% node_read_cpumap
520 . . . . . . . . . . -520 . -100% dynamic_irq_init
520 . . . . . -520 . . . . . . -100% cpuset_cpus_allowed
520 . . . . . -520 . . . . . . -100% cpuset_change_cpumask
520 . . . . . . . . . -520 . . -100% cpu_to_phys_group
520 . . . . . . . . . -520 . . -100% cpu_to_core_group
520 . . . . . . -520 . . . . . -100% affinity_restore
512 . . . . . -512 . . . . . . -100% cpuset_cpus_allowed_locked
0 . . . . . . . . . +760 . 760 . sd_init_SIBLING
0 . . . . . . . . . +760 . 760 . sd_init_NODE
0 . . . . . . . . . +752 . 752 . sd_init_MC
0 . . . . . . . . . +752 . 752 . sd_init_CPU
0 . . . . . . . . . +752 . 752 . sd_init_ALLNODES
0 . . . . . . . . . +512 . 512 . detach_destroy_domains
103584 -21056 -1048 -3824 -4608 -6184 -4232 -3088 -8248 -6512 -14264 -2064 28456 -72% Totals
--
* [PATCH 01/12] x86: Convert cpumask_of_cpu macro to allocated array
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel,
Ingo Molnar, Christoph Lameter
* Here is a simple patch to use an allocated array of cpumasks to
represent cpumask_of_cpu() instead of constructing one on the stack.
It's based on the Kconfig option "HAVE_CPUMASK_OF_CPU_MAP", which is
currently only set for x86_64 SMP. Otherwise the existing
cpumask_of_cpu() is used, but it has been changed to produce an lvalue
so that a pointer to it can be taken.
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/Kconfig | 3 +++
arch/x86/kernel/setup.c | 28 +++++++++++++++++++++++++++-
include/linux/cpumask.h | 12 +++++++++---
3 files changed, 39 insertions(+), 4 deletions(-)
--- linux-2.6.x86.orig/arch/x86/Kconfig
+++ linux-2.6.x86/arch/x86/Kconfig
@@ -124,6 +124,9 @@ config ARCH_HAS_CPU_RELAX
config HAVE_SETUP_PER_CPU_AREA
def_bool X86_64 || (X86_SMP && !X86_VOYAGER)
+config HAVE_CPUMASK_OF_CPU_MAP
+ def_bool X86_64_SMP
+
config ARCH_HIBERNATION_POSSIBLE
def_bool y
depends on !SMP || !X86_VOYAGER
--- linux-2.6.x86.orig/arch/x86/kernel/setup.c
+++ linux-2.6.x86/arch/x86/kernel/setup.c
@@ -38,6 +38,24 @@ static void __init setup_per_cpu_maps(vo
#endif
}
+#ifdef CONFIG_HAVE_CPUMASK_OF_CPU_MAP
+cpumask_t *cpumask_of_cpu_map __read_mostly;
+EXPORT_SYMBOL(cpumask_of_cpu_map);
+
+/* requires nr_cpu_ids to be initialized */
+static void __init setup_cpumask_of_cpu(void)
+{
+ int i;
+
+ /* alloc_bootmem zeroes memory */
+ cpumask_of_cpu_map = alloc_bootmem_low(sizeof(cpumask_t) * nr_cpu_ids);
+ for (i = 0; i < nr_cpu_ids; i++)
+ cpu_set(i, cpumask_of_cpu_map[i]);
+}
+#else
+static inline void setup_cpumask_of_cpu(void) { }
+#endif
+
#ifdef CONFIG_X86_32
/*
* Great future not-so-futuristic plan: make i386 and x86_64 do it
@@ -54,7 +72,7 @@ EXPORT_SYMBOL(__per_cpu_offset);
*/
void __init setup_per_cpu_areas(void)
{
- int i;
+ int i, highest_cpu = 0;
unsigned long size;
#ifdef CONFIG_HOTPLUG_CPU
@@ -88,10 +106,18 @@ void __init setup_per_cpu_areas(void)
__per_cpu_offset[i] = ptr - __per_cpu_start;
#endif
memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
+
+ highest_cpu = i;
}
+ nr_cpu_ids = highest_cpu + 1;
+ printk(KERN_DEBUG "NR_CPUS: %d, nr_cpu_ids: %d\n", NR_CPUS, nr_cpu_ids);
+
/* Setup percpu data maps */
setup_per_cpu_maps();
+
+ /* Setup cpumask_of_cpu map */
+ setup_cpumask_of_cpu();
}
#endif
--- linux-2.6.x86.orig/include/linux/cpumask.h
+++ linux-2.6.x86/include/linux/cpumask.h
@@ -222,8 +222,13 @@ int __next_cpu(int n, const cpumask_t *s
#define next_cpu(n, src) ({ (void)(src); 1; })
#endif
+#ifdef CONFIG_HAVE_CPUMASK_OF_CPU_MAP
+extern cpumask_t *cpumask_of_cpu_map;
+#define cpumask_of_cpu(cpu) (cpumask_of_cpu_map[cpu])
+
+#else
#define cpumask_of_cpu(cpu) \
-({ \
+(*({ \
typeof(_unused_cpumask_arg_) m; \
if (sizeof(m) == sizeof(unsigned long)) { \
m.bits[0] = 1UL<<(cpu); \
@@ -231,8 +236,9 @@ int __next_cpu(int n, const cpumask_t *s
cpus_clear(m); \
cpu_set((cpu), m); \
} \
- m; \
-})
+ &m; \
+}))
+#endif
#define CPU_MASK_LAST_WORD BITMAP_LAST_WORD_MASK(NR_CPUS)
--
* [PATCH 02/12] cpumask: add CPU_MASK_ALL_PTR macro
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
* Add a CPU_MASK_ALL_PTR macro that provides a pointer to CPU_MASK_ALL,
backed by a static cpumask_t variable (cpu_mask_all) when
NR_CPUS > BITS_PER_LONG. Where possible, this avoids instances where
CPU_MASK_ALL allocates and fills a large array on the stack.
* Change init/main.c to use the new set_cpus_allowed_ptr().
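A minimal sketch of the intended use (illustration only, not a hunk from
this patch):

    static void allow_to_run_anywhere(struct task_struct *p)
    {
            /* Points at the shared static cpu_mask_all (or at
             * CPU_MASK_ALL itself when NR_CPUS <= BITS_PER_LONG), so no
             * multi-word initializer lands on this stack frame. */
            set_cpus_allowed_ptr(p, CPU_MASK_ALL_PTR);
    }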
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
# x86
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
include/linux/cpumask.h | 6 ++++++
init/main.c | 7 ++++++-
2 files changed, 12 insertions(+), 1 deletion(-)
--- linux-2.6.x86.orig/include/linux/cpumask.h
+++ linux-2.6.x86/include/linux/cpumask.h
@@ -249,6 +249,8 @@ extern cpumask_t *cpumask_of_cpu_map;
[BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD \
} }
+#define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)
+
#else
#define CPU_MASK_ALL \
@@ -257,6 +259,10 @@ extern cpumask_t *cpumask_of_cpu_map;
[BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD \
} }
+/* cpu_mask_all is in init/main.c */
+extern cpumask_t cpu_mask_all;
+#define CPU_MASK_ALL_PTR (&cpu_mask_all)
+
#endif
#define CPU_MASK_NONE \
--- linux-2.6.x86.orig/init/main.c
+++ linux-2.6.x86/init/main.c
@@ -366,6 +366,11 @@ static inline void smp_prepare_cpus(unsi
#else
+#if NR_CPUS > BITS_PER_LONG
+cpumask_t cpu_mask_all __read_mostly = CPU_MASK_ALL;
+EXPORT_SYMBOL(cpu_mask_all);
+#endif
+
/* Setup number of possible processor ids */
int nr_cpu_ids __read_mostly = NR_CPUS;
EXPORT_SYMBOL(nr_cpu_ids);
@@ -837,7 +842,7 @@ static int __init kernel_init(void * unu
/*
* init can run on any cpu.
*/
- set_cpus_allowed(current, CPU_MASK_ALL);
+ set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR);
/*
* Tell the world that we're going to be the grim
* reaper of innocent orphaned children.
--
* [PATCH 03/12] cpumask: reduce stack pressure in cpu_coregroup_map
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel,
David S. Miller, William L. Irwin
* Return a pointer to the requested cpumask_t value from the
cpu_coregroup_map() functions instead of returning the cpumask_t value
on the stack. The only users of these functions are in the sparc64 and
x86 architectures.
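The caller-side effect, sketched (illustration only): read-only users can
work entirely through the returned pointer, while call sites that still
want a private copy dereference it, as the kernel/sched.c hunks below do.

    static int first_core_sibling(int cpu)
    {
            /* No cpumask_t copy is made: first_cpu() only reads
             * through the returned pointer. */
            return first_cpu(*cpu_coregroup_map(cpu));
    }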
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
# sparc
Cc: David S. Miller <davem@davemloft.net>
Cc: William L. Irwin <wli@holomorphy.com>
# x86
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/smpboot.c | 6 +++---
include/asm-sparc64/topology.h | 2 +-
include/asm-x86/topology.h | 2 +-
kernel/sched.c | 6 +++---
4 files changed, 8 insertions(+), 8 deletions(-)
--- linux-2.6.x86.orig/arch/x86/kernel/smpboot.c
+++ linux-2.6.x86/arch/x86/kernel/smpboot.c
@@ -545,7 +545,7 @@ void __cpuinit set_cpu_sibling_map(int c
}
/* maps the cpu to the sched domain representing multi-core */
-cpumask_t cpu_coregroup_map(int cpu)
+const cpumask_t *cpu_coregroup_map(int cpu)
{
struct cpuinfo_x86 *c = &cpu_data(cpu);
/*
@@ -553,9 +553,9 @@ cpumask_t cpu_coregroup_map(int cpu)
* And for power savings, we return cpu_core_map
*/
if (sched_mc_power_savings || sched_smt_power_savings)
- return per_cpu(cpu_core_map, cpu);
+ return &per_cpu(cpu_core_map, cpu);
else
- return c->llc_shared_map;
+ return &c->llc_shared_map;
}
/*
--- linux-2.6.x86.orig/include/asm-sparc64/topology.h
+++ linux-2.6.x86/include/asm-sparc64/topology.h
@@ -12,6 +12,6 @@
#include <asm-generic/topology.h>
-#define cpu_coregroup_map(cpu) (cpu_core_map[cpu])
+#define cpu_coregroup_map(cpu) (&cpu_core_map[cpu])
#endif /* _ASM_SPARC64_TOPOLOGY_H */
--- linux-2.6.x86.orig/include/asm-x86/topology.h
+++ linux-2.6.x86/include/asm-x86/topology.h
@@ -201,7 +201,7 @@ static inline void set_mp_bus_to_node(in
#include <asm-generic/topology.h>
-extern cpumask_t cpu_coregroup_map(int cpu);
+const cpumask_t *cpu_coregroup_map(int cpu);
#ifdef ENABLE_TOPO_DEFINES
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
--- linux-2.6.x86.orig/kernel/sched.c
+++ linux-2.6.x86/kernel/sched.c
@@ -6615,7 +6615,7 @@ cpu_to_phys_group(int cpu, const cpumask
{
int group;
#ifdef CONFIG_SCHED_MC
- cpumask_t mask = cpu_coregroup_map(cpu);
+ cpumask_t mask = *cpu_coregroup_map(cpu);
cpus_and(mask, mask, *cpu_map);
group = first_cpu(mask);
#elif defined(CONFIG_SCHED_SMT)
@@ -6849,7 +6849,7 @@ static int build_sched_domains(const cpu
p = sd;
sd = &per_cpu(core_domains, i);
*sd = SD_MC_INIT;
- sd->span = cpu_coregroup_map(i);
+ sd->span = *cpu_coregroup_map(i);
cpus_and(sd->span, sd->span, *cpu_map);
sd->parent = p;
p->child = sd;
@@ -6884,7 +6884,7 @@ static int build_sched_domains(const cpu
#ifdef CONFIG_SCHED_MC
/* Set up multi-core groups */
for_each_cpu_mask(i, *cpu_map) {
- cpumask_t this_core_map = cpu_coregroup_map(i);
+ cpumask_t this_core_map = *cpu_coregroup_map(i);
cpus_and(this_core_map, this_core_map, *cpu_map);
if (i != first_cpu(this_core_map))
continue;
--
* [PATCH 04/12] sched: Remove fixed NR_CPUS sized arrays in kernel_sched_c v2
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
* Change fixed size arrays to per_cpu variables or dynamically allocated
arrays in sched_init() and sched_init_smp().
(1) static struct sched_entity *init_sched_entity_p[NR_CPUS];
(1) static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
(1) static struct sched_rt_entity *init_sched_rt_entity_p[NR_CPUS];
(1) static struct rt_rq *init_rt_rq_p[NR_CPUS];
static struct sched_group **sched_group_nodes_bycpu[NR_CPUS];
(1) - these arrays are allocated via alloc_bootmem_low()
* Change sched_domain_debug_one() to use cpulist_scnprintf() instead of
cpumask_scnprintf(). This reduces the output buffer required and
improves readability when machines with large NR_CPUS counts arrive.
* In sched_create_group() we allocate new arrays based on nr_cpu_ids.
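Condensed, the allocation pattern that replaces the NR_CPUS-sized arrays
is (a summary of the sched_init() hunk below, not new code):

    /* Before: .bss cost fixed at compile time, NR_CPUS entries. */
    static struct sched_entity *init_sched_entity_p[NR_CPUS];

    /* After: one boot-time block sized by the CPU ids actually possible
     * on this system. */
    ptr = (unsigned long)alloc_bootmem_low(alloc_size);
    init_task_group.se = (struct sched_entity **)ptr;
    ptr += nr_cpu_ids * sizeof(void **);
    init_task_group.cfs_rq = (struct cfs_rq **)ptr;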
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Travis <travis@sgi.com>
---
v2: Removed reference to cpumask_scnprintf_len().
---
kernel/sched.c | 80 +++++++++++++++++++++++++++++++++++++--------------------
1 file changed, 52 insertions(+), 28 deletions(-)
--- linux-2.6.x86.orig/kernel/sched.c
+++ linux-2.6.x86/kernel/sched.c
@@ -68,6 +68,7 @@
#include <linux/hrtimer.h>
#include <linux/ftrace.h>
#include <linux/tick.h>
+#include <linux/bootmem.h>
#include <asm/tlb.h>
#include <asm/irq_regs.h>
@@ -278,17 +279,11 @@ struct task_group {
static DEFINE_PER_CPU(struct sched_entity, init_sched_entity);
/* Default task group's cfs_rq on each cpu */
static DEFINE_PER_CPU(struct cfs_rq, init_cfs_rq) ____cacheline_aligned_in_smp;
-
-static struct sched_entity *init_sched_entity_p[NR_CPUS];
-static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
#endif
#ifdef CONFIG_RT_GROUP_SCHED
static DEFINE_PER_CPU(struct sched_rt_entity, init_sched_rt_entity);
static DEFINE_PER_CPU(struct rt_rq, init_rt_rq) ____cacheline_aligned_in_smp;
-
-static struct sched_rt_entity *init_sched_rt_entity_p[NR_CPUS];
-static struct rt_rq *init_rt_rq_p[NR_CPUS];
#endif
/* task_group_lock serializes add/remove of task groups and also changes to
@@ -312,17 +307,7 @@ static int init_task_group_load = INIT_T
/* Default task group.
* Every task in system belong to this group at bootup.
*/
-struct task_group init_task_group = {
-#ifdef CONFIG_FAIR_GROUP_SCHED
- .se = init_sched_entity_p,
- .cfs_rq = init_cfs_rq_p,
-#endif
-
-#ifdef CONFIG_RT_GROUP_SCHED
- .rt_se = init_sched_rt_entity_p,
- .rt_rq = init_rt_rq_p,
-#endif
-};
+struct task_group init_task_group;
/* return group to which a task belongs */
static inline struct task_group *task_group(struct task_struct *p)
@@ -3754,7 +3739,7 @@ static inline void trigger_load_balance(
*/
int ilb = first_cpu(nohz.cpu_mask);
- if (ilb != NR_CPUS)
+ if (ilb < nr_cpu_ids)
resched_cpu(ilb);
}
}
@@ -5729,11 +5714,11 @@ static void move_task_off_dead_cpu(int d
dest_cpu = any_online_cpu(mask);
/* On any allowed CPU? */
- if (dest_cpu == NR_CPUS)
+ if (dest_cpu >= nr_cpu_ids)
dest_cpu = any_online_cpu(p->cpus_allowed);
/* No more Mr. Nice Guy. */
- if (dest_cpu == NR_CPUS) {
+ if (dest_cpu >= nr_cpu_ids) {
cpumask_t cpus_allowed = cpuset_cpus_allowed_locked(p);
/*
* Try to stay on the same cpuset, where the
@@ -6188,9 +6173,9 @@ static int sched_domain_debug_one(struct
{
struct sched_group *group = sd->groups;
cpumask_t groupmask;
- char str[NR_CPUS];
+ char str[256];
- cpumask_scnprintf(str, NR_CPUS, sd->span);
+ cpulist_scnprintf(str, sizeof(str), sd->span);
cpus_clear(groupmask);
printk(KERN_DEBUG "%*s domain %d: ", level, "", level);
@@ -6243,7 +6228,7 @@ static int sched_domain_debug_one(struct
cpus_or(groupmask, groupmask, group->cpumask);
- cpumask_scnprintf(str, NR_CPUS, group->cpumask);
+ cpulist_scnprintf(str, sizeof(str), group->cpumask);
printk(KERN_CONT " %s", str);
group = group->next;
@@ -6637,7 +6622,7 @@ cpu_to_phys_group(int cpu, const cpumask
* gets dynamically allocated.
*/
static DEFINE_PER_CPU(struct sched_domain, node_domains);
-static struct sched_group **sched_group_nodes_bycpu[NR_CPUS];
+static struct sched_group ***sched_group_nodes_bycpu;
static DEFINE_PER_CPU(struct sched_domain, allnodes_domains);
static DEFINE_PER_CPU(struct sched_group, sched_group_allnodes);
@@ -7280,6 +7265,11 @@ void __init sched_init_smp(void)
{
cpumask_t non_isolated_cpus;
+#if defined(CONFIG_NUMA)
+ sched_group_nodes_bycpu = kzalloc(nr_cpu_ids * sizeof(void **),
+ GFP_KERNEL);
+ BUG_ON(sched_group_nodes_bycpu == NULL);
+#endif
get_online_cpus();
arch_init_sched_domains(&cpu_online_map);
non_isolated_cpus = cpu_possible_map;
@@ -7297,6 +7287,11 @@ void __init sched_init_smp(void)
#else
void __init sched_init_smp(void)
{
+#if defined(CONFIG_NUMA)
+ sched_group_nodes_bycpu = kzalloc(nr_cpu_ids * sizeof(void **),
+ GFP_KERNEL);
+ BUG_ON(sched_group_nodes_bycpu == NULL);
+#endif
sched_init_granularity();
}
#endif /* CONFIG_SMP */
@@ -7393,6 +7388,35 @@ static void init_tg_rt_entry(struct rq *
void __init sched_init(void)
{
int i, j;
+ unsigned long alloc_size = 0, ptr;
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+ alloc_size += 2 * nr_cpu_ids * sizeof(void **);
+#endif
+#ifdef CONFIG_RT_GROUP_SCHED
+ alloc_size += 2 * nr_cpu_ids * sizeof(void **);
+#endif
+ /*
+ * As sched_init() is called before page_alloc is setup,
+ * we use alloc_bootmem().
+ */
+ if (alloc_size) {
+ ptr = (unsigned long)alloc_bootmem_low(alloc_size);
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+ init_task_group.se = (struct sched_entity **)ptr;
+ ptr += nr_cpu_ids * sizeof(void **);
+
+ init_task_group.cfs_rq = (struct cfs_rq **)ptr;
+ ptr += nr_cpu_ids * sizeof(void **);
+#endif
+#ifdef CONFIG_RT_GROUP_SCHED
+ init_task_group.rt_se = (struct sched_rt_entity **)ptr;
+ ptr += nr_cpu_ids * sizeof(void **);
+
+ init_task_group.rt_rq = (struct rt_rq **)ptr;
+#endif
+ }
#ifdef CONFIG_SMP
init_defrootdomain();
@@ -7643,10 +7667,10 @@ static int alloc_fair_sched_group(struct
struct rq *rq;
int i;
- tg->cfs_rq = kzalloc(sizeof(cfs_rq) * NR_CPUS, GFP_KERNEL);
+ tg->cfs_rq = kzalloc(sizeof(cfs_rq) * nr_cpu_ids, GFP_KERNEL);
if (!tg->cfs_rq)
goto err;
- tg->se = kzalloc(sizeof(se) * NR_CPUS, GFP_KERNEL);
+ tg->se = kzalloc(sizeof(se) * nr_cpu_ids, GFP_KERNEL);
if (!tg->se)
goto err;
@@ -7728,10 +7752,10 @@ static int alloc_rt_sched_group(struct t
struct rq *rq;
int i;
- tg->rt_rq = kzalloc(sizeof(rt_rq) * NR_CPUS, GFP_KERNEL);
+ tg->rt_rq = kzalloc(sizeof(rt_rq) * nr_cpu_ids, GFP_KERNEL);
if (!tg->rt_rq)
goto err;
- tg->rt_se = kzalloc(sizeof(rt_se) * NR_CPUS, GFP_KERNEL);
+ tg->rt_se = kzalloc(sizeof(rt_se) * nr_cpu_ids, GFP_KERNEL);
if (!tg->rt_se)
goto err;
--
* [PATCH 05/12] x86: use new set_cpus_allowed_ptr function
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel,
Len Brown, Dave Jones
* Use the new set_cpus_allowed_ptr() function added by a previous patch,
which, instead of taking the "newly allowed cpus" cpumask_t arg
by value, takes it by pointer:
-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
* Clean up uses of CPU_MASK_ALL.
* Collapse other NR_CPUS changes into
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c, using pointers to cpumask_t
arguments wherever possible (the recurring pattern is sketched below).
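The conversion is mechanical; the recurring save/pin/restore pattern
looks like this afterward (a composite sketch; do_pinned_work() is a
hypothetical payload, not a function from these files):

    static int run_on_cpu(int cpu)
    {
            cpumask_t saved_mask = current->cpus_allowed;
            int ret;

            set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
            ret = do_pinned_work(cpu);      /* hypothetical payload */
            set_cpus_allowed_ptr(current, &saved_mask);
            return ret;
    }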
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
# acpi/cpufreq
Cc: Len Brown <len.brown@intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/acpi/cstate.c | 4 +-
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c | 28 ++++++++++----------
arch/x86/kernel/cpu/cpufreq/powernow-k8.c | 32 ++++++++++++-----------
arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c | 13 +++++----
arch/x86/kernel/cpu/cpufreq/speedstep-ich.c | 20 +++++++-------
arch/x86/kernel/cpu/intel_cacheinfo.c | 4 +-
arch/x86/kernel/microcode.c | 16 +++++------
arch/x86/kernel/reboot.c | 2 -
8 files changed, 61 insertions(+), 58 deletions(-)
--- linux-2.6.x86.orig/arch/x86/kernel/acpi/cstate.c
+++ linux-2.6.x86/arch/x86/kernel/acpi/cstate.c
@@ -91,7 +91,7 @@ int acpi_processor_ffh_cstate_probe(unsi
/* Make sure we are running on right CPU */
saved_mask = current->cpus_allowed;
- retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ retval = set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
if (retval)
return -1;
@@ -128,7 +128,7 @@ int acpi_processor_ffh_cstate_probe(unsi
cx->address);
out:
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
return retval;
}
EXPORT_SYMBOL_GPL(acpi_processor_ffh_cstate_probe);
--- linux-2.6.x86.orig/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ linux-2.6.x86/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -192,9 +192,9 @@ static void drv_read(struct drv_cmd *cmd
cpumask_t saved_mask = current->cpus_allowed;
cmd->val = 0;
- set_cpus_allowed(current, cmd->mask);
+ set_cpus_allowed_ptr(current, &cmd->mask);
do_drv_read(cmd);
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
}
static void drv_write(struct drv_cmd *cmd)
@@ -203,30 +203,30 @@ static void drv_write(struct drv_cmd *cm
unsigned int i;
for_each_cpu_mask(i, cmd->mask) {
- set_cpus_allowed(current, cpumask_of_cpu(i));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(i));
do_drv_write(cmd);
}
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
return;
}
-static u32 get_cur_val(cpumask_t mask)
+static u32 get_cur_val(const cpumask_t *mask)
{
struct acpi_processor_performance *perf;
struct drv_cmd cmd;
- if (unlikely(cpus_empty(mask)))
+ if (unlikely(cpus_empty(*mask)))
return 0;
- switch (per_cpu(drv_data, first_cpu(mask))->cpu_feature) {
+ switch (per_cpu(drv_data, first_cpu(*mask))->cpu_feature) {
case SYSTEM_INTEL_MSR_CAPABLE:
cmd.type = SYSTEM_INTEL_MSR_CAPABLE;
cmd.addr.msr.reg = MSR_IA32_PERF_STATUS;
break;
case SYSTEM_IO_CAPABLE:
cmd.type = SYSTEM_IO_CAPABLE;
- perf = per_cpu(drv_data, first_cpu(mask))->acpi_data;
+ perf = per_cpu(drv_data, first_cpu(*mask))->acpi_data;
cmd.addr.io.port = perf->control_register.address;
cmd.addr.io.bit_width = perf->control_register.bit_width;
break;
@@ -234,7 +234,7 @@ static u32 get_cur_val(cpumask_t mask)
return 0;
}
- cmd.mask = mask;
+ cmd.mask = *mask;
drv_read(&cmd);
@@ -271,7 +271,7 @@ static unsigned int get_measured_perf(un
unsigned int retval;
saved_mask = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
if (get_cpu() != cpu) {
/* We were not able to run on requested processor */
put_cpu();
@@ -329,7 +329,7 @@ static unsigned int get_measured_perf(un
retval = per_cpu(drv_data, cpu)->max_freq * perf_percent / 100;
put_cpu();
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
dprintk("cpu %d: performance percent %d\n", cpu, perf_percent);
return retval;
@@ -347,13 +347,13 @@ static unsigned int get_cur_freq_on_cpu(
return 0;
}
- freq = extract_freq(get_cur_val(cpumask_of_cpu(cpu)), data);
+ freq = extract_freq(get_cur_val(&cpumask_of_cpu(cpu)), data);
dprintk("cur freq = %u\n", freq);
return freq;
}
-static unsigned int check_freqs(cpumask_t mask, unsigned int freq,
+static unsigned int check_freqs(const cpumask_t *mask, unsigned int freq,
struct acpi_cpufreq_data *data)
{
unsigned int cur_freq;
@@ -449,7 +449,7 @@ static int acpi_cpufreq_target(struct cp
drv_write(&cmd);
if (acpi_pstate_strict) {
- if (!check_freqs(cmd.mask, freqs.new, data)) {
+ if (!check_freqs(&cmd.mask, freqs.new, data)) {
dprintk("acpi_cpufreq_target failed (%d)\n",
policy->cpu);
return -EAGAIN;
--- linux-2.6.x86.orig/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
+++ linux-2.6.x86/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
@@ -478,12 +478,12 @@ static int core_voltage_post_transition(
static int check_supported_cpu(unsigned int cpu)
{
- cpumask_t oldmask = CPU_MASK_ALL;
+ cpumask_t oldmask;
u32 eax, ebx, ecx, edx;
unsigned int rc = 0;
oldmask = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
if (smp_processor_id() != cpu) {
printk(KERN_ERR PFX "limiting to cpu %u failed\n", cpu);
@@ -528,7 +528,7 @@ static int check_supported_cpu(unsigned
rc = 1;
out:
- set_cpus_allowed(current, oldmask);
+ set_cpus_allowed_ptr(current, &oldmask);
return rc;
}
@@ -1015,7 +1015,7 @@ static int transition_frequency_pstate(s
/* Driver entry point to switch to the target frequency */
static int powernowk8_target(struct cpufreq_policy *pol, unsigned targfreq, unsigned relation)
{
- cpumask_t oldmask = CPU_MASK_ALL;
+ cpumask_t oldmask;
struct powernow_k8_data *data = per_cpu(powernow_data, pol->cpu);
u32 checkfid;
u32 checkvid;
@@ -1030,7 +1030,7 @@ static int powernowk8_target(struct cpuf
/* only run on specific CPU from here on */
oldmask = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(pol->cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(pol->cpu));
if (smp_processor_id() != pol->cpu) {
printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1085,7 +1085,7 @@ static int powernowk8_target(struct cpuf
ret = 0;
err_out:
- set_cpus_allowed(current, oldmask);
+ set_cpus_allowed_ptr(current, &oldmask);
return ret;
}
@@ -1104,7 +1104,7 @@ static int powernowk8_verify(struct cpuf
static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
{
struct powernow_k8_data *data;
- cpumask_t oldmask = CPU_MASK_ALL;
+ cpumask_t oldmask;
int rc;
if (!cpu_online(pol->cpu))
@@ -1145,7 +1145,7 @@ static int __cpuinit powernowk8_cpu_init
/* only run on specific CPU from here on */
oldmask = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(pol->cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(pol->cpu));
if (smp_processor_id() != pol->cpu) {
printk(KERN_ERR PFX "limiting to cpu %u failed\n", pol->cpu);
@@ -1164,7 +1164,7 @@ static int __cpuinit powernowk8_cpu_init
fidvid_msr_init();
/* run on any CPU again */
- set_cpus_allowed(current, oldmask);
+ set_cpus_allowed_ptr(current, &oldmask);
if (cpu_family == CPU_HW_PSTATE)
pol->cpus = cpumask_of_cpu(pol->cpu);
@@ -1205,7 +1205,7 @@ static int __cpuinit powernowk8_cpu_init
return 0;
err_out:
- set_cpus_allowed(current, oldmask);
+ set_cpus_allowed_ptr(current, &oldmask);
powernow_k8_cpu_exit_acpi(data);
kfree(data);
@@ -1242,10 +1242,11 @@ static unsigned int powernowk8_get (unsi
if (!data)
return -EINVAL;
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
if (smp_processor_id() != cpu) {
- printk(KERN_ERR PFX "limiting to CPU %d failed in powernowk8_get\n", cpu);
- set_cpus_allowed(current, oldmask);
+ printk(KERN_ERR PFX
+ "limiting to CPU %d failed in powernowk8_get\n", cpu);
+ set_cpus_allowed_ptr(current, &oldmask);
return 0;
}
@@ -1253,13 +1254,14 @@ static unsigned int powernowk8_get (unsi
goto out;
if (cpu_family == CPU_HW_PSTATE)
- khz = find_khz_freq_from_pstate(data->powernow_table, data->currpstate);
+ khz = find_khz_freq_from_pstate(data->powernow_table,
+ data->currpstate);
else
khz = find_khz_freq_from_fid(data->currfid);
out:
- set_cpus_allowed(current, oldmask);
+ set_cpus_allowed_ptr(current, &oldmask);
return khz;
}
--- linux-2.6.x86.orig/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
+++ linux-2.6.x86/arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c
@@ -315,7 +315,7 @@ static unsigned int get_cur_freq(unsigne
cpumask_t saved_mask;
saved_mask = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
if (smp_processor_id() != cpu)
return 0;
@@ -333,7 +333,7 @@ static unsigned int get_cur_freq(unsigne
clock_freq = extract_clock(l, cpu, 1);
}
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
return clock_freq;
}
@@ -487,7 +487,7 @@ static int centrino_target (struct cpufr
else
cpu_set(j, set_mask);
- set_cpus_allowed(current, set_mask);
+ set_cpus_allowed_ptr(current, &set_mask);
preempt_disable();
if (unlikely(!cpu_isset(smp_processor_id(), set_mask))) {
dprintk("couldn't limit to CPUs in this domain\n");
@@ -555,7 +555,8 @@ static int centrino_target (struct cpufr
if (!cpus_empty(covered_cpus)) {
for_each_cpu_mask(j, covered_cpus) {
- set_cpus_allowed(current, cpumask_of_cpu(j));
+ set_cpus_allowed_ptr(current,
+ &cpumask_of_cpu(j));
wrmsr(MSR_IA32_PERF_CTL, oldmsr, h);
}
}
@@ -569,12 +570,12 @@ static int centrino_target (struct cpufr
cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
}
}
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
return 0;
migrate_end:
preempt_enable();
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
return 0;
}
--- linux-2.6.x86.orig/arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
+++ linux-2.6.x86/arch/x86/kernel/cpu/cpufreq/speedstep-ich.c
@@ -229,22 +229,22 @@ static unsigned int speedstep_detect_chi
return 0;
}
-static unsigned int _speedstep_get(cpumask_t cpus)
+static unsigned int _speedstep_get(const cpumask_t *cpus)
{
unsigned int speed;
cpumask_t cpus_allowed;
cpus_allowed = current->cpus_allowed;
- set_cpus_allowed(current, cpus);
+ set_cpus_allowed_ptr(current, cpus);
speed = speedstep_get_processor_frequency(speedstep_processor);
- set_cpus_allowed(current, cpus_allowed);
+ set_cpus_allowed_ptr(current, &cpus_allowed);
dprintk("detected %u kHz as current frequency\n", speed);
return speed;
}
static unsigned int speedstep_get(unsigned int cpu)
{
- return _speedstep_get(cpumask_of_cpu(cpu));
+ return _speedstep_get(&cpumask_of_cpu(cpu));
}
/**
@@ -267,7 +267,7 @@ static int speedstep_target (struct cpuf
if (cpufreq_frequency_table_target(policy, &speedstep_freqs[0], target_freq, relation, &newstate))
return -EINVAL;
- freqs.old = _speedstep_get(policy->cpus);
+ freqs.old = _speedstep_get(&policy->cpus);
freqs.new = speedstep_freqs[newstate].frequency;
freqs.cpu = policy->cpu;
@@ -285,12 +285,12 @@ static int speedstep_target (struct cpuf
}
/* switch to physical CPU where state is to be changed */
- set_cpus_allowed(current, policy->cpus);
+ set_cpus_allowed_ptr(current, &policy->cpus);
speedstep_set_state(newstate);
/* allow to be run on all CPUs */
- set_cpus_allowed(current, cpus_allowed);
+ set_cpus_allowed_ptr(current, &cpus_allowed);
for_each_cpu_mask(i, policy->cpus) {
freqs.cpu = i;
@@ -326,7 +326,7 @@ static int speedstep_cpu_init(struct cpu
#endif
cpus_allowed = current->cpus_allowed;
- set_cpus_allowed(current, policy->cpus);
+ set_cpus_allowed_ptr(current, &policy->cpus);
/* detect low and high frequency and transition latency */
result = speedstep_get_freqs(speedstep_processor,
@@ -334,12 +334,12 @@ static int speedstep_cpu_init(struct cpu
&speedstep_freqs[SPEEDSTEP_HIGH].frequency,
&policy->cpuinfo.transition_latency,
&speedstep_set_state);
- set_cpus_allowed(current, cpus_allowed);
+ set_cpus_allowed_ptr(current, &cpus_allowed);
if (result)
return result;
/* get current speed setting */
- speed = _speedstep_get(policy->cpus);
+ speed = _speedstep_get(&policy->cpus);
if (!speed)
return -EIO;
--- linux-2.6.x86.orig/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ linux-2.6.x86/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -525,7 +525,7 @@ static int __cpuinit detect_cache_attrib
return -ENOMEM;
oldmask = current->cpus_allowed;
- retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ retval = set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
if (retval)
goto out;
@@ -542,7 +542,7 @@ static int __cpuinit detect_cache_attrib
}
cache_shared_cpu_map_setup(cpu, j);
}
- set_cpus_allowed(current, oldmask);
+ set_cpus_allowed_ptr(current, &oldmask);
out:
if (retval) {
--- linux-2.6.x86.orig/arch/x86/kernel/microcode.c
+++ linux-2.6.x86/arch/x86/kernel/microcode.c
@@ -402,7 +402,7 @@ static int do_microcode_update (void)
if (!uci->valid)
continue;
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
error = get_maching_microcode(new_mc, cpu);
if (error < 0)
goto out;
@@ -416,7 +416,7 @@ out:
vfree(new_mc);
if (cursor < 0)
error = cursor;
- set_cpus_allowed(current, old);
+ set_cpus_allowed_ptr(current, &old);
return error;
}
@@ -579,7 +579,7 @@ static int apply_microcode_check_cpu(int
return 0;
old = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
/* Check if the microcode we have in memory matches the CPU */
if (c->x86_vendor != X86_VENDOR_INTEL || c->x86 < 6 ||
@@ -610,7 +610,7 @@ static int apply_microcode_check_cpu(int
" sig=0x%x, pf=0x%x, rev=0x%x\n",
cpu, uci->sig, uci->pf, uci->rev);
- set_cpus_allowed(current, old);
+ set_cpus_allowed_ptr(current, &old);
return err;
}
@@ -621,13 +621,13 @@ static void microcode_init_cpu(int cpu,
old = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
mutex_lock(&microcode_mutex);
collect_cpu_info(cpu);
if (uci->valid && system_state == SYSTEM_RUNNING && !resume)
cpu_request_microcode(cpu);
mutex_unlock(&microcode_mutex);
- set_cpus_allowed(current, old);
+ set_cpus_allowed_ptr(current, &old);
}
static void microcode_fini_cpu(int cpu)
@@ -657,14 +657,14 @@ static ssize_t reload_store(struct sys_d
old = current->cpus_allowed;
get_online_cpus();
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
mutex_lock(&microcode_mutex);
if (uci->valid)
err = cpu_request_microcode(cpu);
mutex_unlock(&microcode_mutex);
put_online_cpus();
- set_cpus_allowed(current, old);
+ set_cpus_allowed_ptr(current, &old);
}
if (err)
return err;
--- linux-2.6.x86.orig/arch/x86/kernel/reboot.c
+++ linux-2.6.x86/arch/x86/kernel/reboot.c
@@ -420,7 +420,7 @@ static void native_machine_shutdown(void
reboot_cpu_id = smp_processor_id();
/* Make certain I only run on the appropriate processor */
- set_cpus_allowed(current, cpumask_of_cpu(reboot_cpu_id));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(reboot_cpu_id));
/* O.K Now that I'm on the appropriate processor,
* stop all of the others.
--
* [PATCH 06/12] generic: use new set_cpus_allowed_ptr function
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
* Use the new set_cpus_allowed_ptr() function added by a previous patch,
which, instead of taking the "newly allowed cpus" cpumask_t arg
by value, takes it by pointer:
-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
* Modify CPU_MASK_ALL uses, filling local masks in place with
cpus_setall() instead of assigning a large on-stack initializer.
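For the CPU_MASK_ALL sites, the rewrite fills the local mask in place
rather than assigning a large initializer; condensed from the
kernel/cpu.c hunk below:

    cpumask_t tmp;

    cpus_setall(tmp);               /* was: tmp = CPU_MASK_ALL; */
    cpu_clear(cpu, tmp);            /* run anywhere but the dying cpu */
    set_cpus_allowed_ptr(current, &tmp);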
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Signed-off-by: Mike Travis <travis@sgi.com>
---
drivers/acpi/processor_throttling.c | 10 +++++-----
drivers/firmware/dcdbas.c | 4 ++--
drivers/pci/pci-driver.c | 9 ++++++---
kernel/cpu.c | 6 +++---
kernel/kmod.c | 2 +-
kernel/kthread.c | 6 +++---
kernel/rcutorture.c | 15 +++++++++------
kernel/stop_machine.c | 2 +-
kernel/trace/trace_sysprof.c | 4 ++--
9 files changed, 32 insertions(+), 26 deletions(-)
--- linux-2.6.x86.orig/drivers/acpi/processor_throttling.c
+++ linux-2.6.x86/drivers/acpi/processor_throttling.c
@@ -838,10 +838,10 @@ static int acpi_processor_get_throttling
* Migrate task to the cpu pointed by pr.
*/
saved_mask = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(pr->id));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(pr->id));
ret = pr->throttling.acpi_processor_get_throttling(pr);
/* restore the previous state */
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
return ret;
}
@@ -1025,7 +1025,7 @@ int acpi_processor_set_throttling(struct
* it can be called only for the cpu pointed by pr.
*/
if (p_throttling->shared_type == DOMAIN_COORD_TYPE_SW_ANY) {
- set_cpus_allowed(current, cpumask_of_cpu(pr->id));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(pr->id));
ret = p_throttling->acpi_processor_set_throttling(pr,
t_state.target_state);
} else {
@@ -1056,7 +1056,7 @@ int acpi_processor_set_throttling(struct
continue;
}
t_state.cpu = i;
- set_cpus_allowed(current, cpumask_of_cpu(i));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(i));
ret = match_pr->throttling.
acpi_processor_set_throttling(
match_pr, t_state.target_state);
@@ -1074,7 +1074,7 @@ int acpi_processor_set_throttling(struct
&t_state);
}
/* restore the previous state */
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
return ret;
}
--- linux-2.6.x86.orig/drivers/firmware/dcdbas.c
+++ linux-2.6.x86/drivers/firmware/dcdbas.c
@@ -265,7 +265,7 @@ static int smi_request(struct smi_cmd *s
/* SMI requires CPU 0 */
old_mask = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(0));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(0));
if (smp_processor_id() != 0) {
dev_dbg(&dcdbas_pdev->dev, "%s: failed to get CPU 0\n",
__FUNCTION__);
@@ -285,7 +285,7 @@ static int smi_request(struct smi_cmd *s
);
out:
- set_cpus_allowed(current, old_mask);
+ set_cpus_allowed_ptr(current, &old_mask);
return ret;
}
--- linux-2.6.x86.orig/drivers/pci/pci-driver.c
+++ linux-2.6.x86/drivers/pci/pci-driver.c
@@ -182,15 +182,18 @@ static int pci_call_probe(struct pci_dri
struct mempolicy *oldpol;
cpumask_t oldmask = current->cpus_allowed;
int node = dev_to_node(&dev->dev);
- if (node >= 0)
- set_cpus_allowed(current, node_to_cpumask(node));
+
+ if (node >= 0) {
+ node_to_cpumask_ptr(nodecpumask, node);
+ set_cpus_allowed_ptr(current, nodecpumask);
+ }
/* And set default memory allocation policy */
oldpol = current->mempolicy;
current->mempolicy = NULL; /* fall back to system default policy */
#endif
error = drv->probe(dev, id);
#ifdef CONFIG_NUMA
- set_cpus_allowed(current, oldmask);
+ set_cpus_allowed_ptr(current, &oldmask);
current->mempolicy = oldpol;
#endif
return error;
--- linux-2.6.x86.orig/kernel/cpu.c
+++ linux-2.6.x86/kernel/cpu.c
@@ -232,9 +232,9 @@ static int _cpu_down(unsigned int cpu, i
/* Ensure that we are not runnable on dying cpu */
old_allowed = current->cpus_allowed;
- tmp = CPU_MASK_ALL;
+ cpus_setall(tmp);
cpu_clear(cpu, tmp);
- set_cpus_allowed(current, tmp);
+ set_cpus_allowed_ptr(current, &tmp);
p = __stop_machine_run(take_cpu_down, &tcd_param, cpu);
@@ -268,7 +268,7 @@ static int _cpu_down(unsigned int cpu, i
out_thread:
err = kthread_stop(p);
out_allowed:
- set_cpus_allowed(current, old_allowed);
+ set_cpus_allowed_ptr(current, &old_allowed);
out_release:
cpu_hotplug_done();
return err;
--- linux-2.6.x86.orig/kernel/kmod.c
+++ linux-2.6.x86/kernel/kmod.c
@@ -165,7 +165,7 @@ static int ____call_usermodehelper(void
}
/* We can run anywhere, unlike our parent keventd(). */
- set_cpus_allowed(current, CPU_MASK_ALL);
+ set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR);
/*
* Our parent is keventd, which runs with elevated scheduling priority.
--- linux-2.6.x86.orig/kernel/kthread.c
+++ linux-2.6.x86/kernel/kthread.c
@@ -109,7 +109,7 @@ static void create_kthread(struct kthrea
*/
sched_setscheduler(create->result, SCHED_NORMAL, &param);
set_user_nice(create->result, KTHREAD_NICE_LEVEL);
- set_cpus_allowed(create->result, cpu_system_map);
+ set_cpus_allowed_ptr(create->result, &cpu_system_map);
}
complete(&create->done);
}
@@ -235,7 +235,7 @@ int kthreadd(void *unused)
set_task_comm(tsk, "kthreadd");
ignore_signals(tsk);
set_user_nice(tsk, KTHREAD_NICE_LEVEL);
- set_cpus_allowed(tsk, cpu_system_map);
+ set_cpus_allowed_ptr(tsk, &cpu_system_map);
current->flags |= PF_NOFREEZE;
@@ -284,7 +284,7 @@ again:
*/
get_task_struct(t);
rcu_read_unlock();
- set_cpus_allowed(t, *new_system_map);
+ set_cpus_allowed_ptr(t, new_system_map);
put_task_struct(t);
goto again;
}
--- linux-2.6.x86.orig/kernel/rcutorture.c
+++ linux-2.6.x86/kernel/rcutorture.c
@@ -723,9 +723,10 @@ static int rcu_idle_cpu; /* Force all to
*/
static void rcu_torture_shuffle_tasks(void)
{
- cpumask_t tmp_mask = CPU_MASK_ALL;
+ cpumask_t tmp_mask;
int i;
+ cpus_setall(tmp_mask);
get_online_cpus();
/* No point in shuffling if there is only one online CPU (ex: UP) */
@@ -737,25 +738,27 @@ static void rcu_torture_shuffle_tasks(vo
if (rcu_idle_cpu != -1)
cpu_clear(rcu_idle_cpu, tmp_mask);
- set_cpus_allowed(current, tmp_mask);
+ set_cpus_allowed_ptr(current, &tmp_mask);
if (reader_tasks) {
for (i = 0; i < nrealreaders; i++)
if (reader_tasks[i])
- set_cpus_allowed(reader_tasks[i], tmp_mask);
+ set_cpus_allowed_ptr(reader_tasks[i],
+ &tmp_mask);
}
if (fakewriter_tasks) {
for (i = 0; i < nfakewriters; i++)
if (fakewriter_tasks[i])
- set_cpus_allowed(fakewriter_tasks[i], tmp_mask);
+ set_cpus_allowed_ptr(fakewriter_tasks[i],
+ &tmp_mask);
}
if (writer_task)
- set_cpus_allowed(writer_task, tmp_mask);
+ set_cpus_allowed_ptr(writer_task, &tmp_mask);
if (stats_task)
- set_cpus_allowed(stats_task, tmp_mask);
+ set_cpus_allowed_ptr(stats_task, &tmp_mask);
if (rcu_idle_cpu == -1)
rcu_idle_cpu = num_online_cpus() - 1;
--- linux-2.6.x86.orig/kernel/stop_machine.c
+++ linux-2.6.x86/kernel/stop_machine.c
@@ -35,7 +35,7 @@ static int stopmachine(void *cpu)
int irqs_disabled = 0;
int prepared = 0;
- set_cpus_allowed(current, cpumask_of_cpu((int)(long)cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu((int)(long)cpu));
/* Ack: we are alive */
smp_mb(); /* Theoretically the ack = 0 might not be on this CPU yet. */
--- linux-2.6.x86.orig/kernel/trace/trace_sysprof.c
+++ linux-2.6.x86/kernel/trace/trace_sysprof.c
@@ -205,10 +205,10 @@ static void start_stack_timers(void)
int cpu;
for_each_online_cpu(cpu) {
- set_cpus_allowed(current, cpumask_of_cpu(cpu));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
start_stack_timer(cpu);
}
- set_cpus_allowed(current, saved_mask);
+ set_cpus_allowed_ptr(current, &saved_mask);
}
static void stop_stack_timer(int cpu)
--
* [PATCH 07/12] cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
* Modify cpuset_cpus_allowed to return the currently allowed cpumask
via a pointer argument instead of as the function return value.
* Use the new set_cpus_allowed_ptr function.
* Clean up CPU_MASK_ALL and NODE_MASK_ALL uses.
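The prototype change, side by side (as in the include/linux/cpuset.h hunk
below), with a typical converted call site:

    -extern cpumask_t cpuset_cpus_allowed(struct task_struct *p);
    +extern void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask);

    cpumask_t cpus_allowed;

    cpuset_cpus_allowed(p, &cpus_allowed);  /* caller supplies storage */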
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Signed-off-by: Mike Travis <travis@sgi.com>
---
include/linux/cpuset.h | 13 +++++++------
kernel/cpuset.c | 31 ++++++++++++-------------------
kernel/sched.c | 8 +++++---
mm/pdflush.c | 4 ++--
4 files changed, 26 insertions(+), 30 deletions(-)
--- linux-2.6.x86.orig/include/linux/cpuset.h
+++ linux-2.6.x86/include/linux/cpuset.h
@@ -20,8 +20,8 @@ extern int number_of_cpusets; /* How man
extern int cpuset_init_early(void);
extern int cpuset_init(void);
extern void cpuset_init_smp(void);
-extern cpumask_t cpuset_cpus_allowed(struct task_struct *p);
-extern cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p);
+extern void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask);
+extern void cpuset_cpus_allowed_locked(struct task_struct *p, cpumask_t *mask);
extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
#define cpuset_current_mems_allowed (current->mems_allowed)
void cpuset_init_current_mems_allowed(void);
@@ -86,13 +86,14 @@ static inline int cpuset_init_early(void
static inline int cpuset_init(void) { return 0; }
static inline void cpuset_init_smp(void) {}
-static inline cpumask_t cpuset_cpus_allowed(struct task_struct *p)
+static inline void cpuset_cpus_allowed(struct task_struct *p, cpumask_t *mask)
{
- return cpu_possible_map;
+ *mask = cpu_possible_map;
}
-static inline cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p)
+static inline void cpuset_cpus_allowed_locked(struct task_struct *p,
+ cpumask_t *mask)
{
- return cpu_possible_map;
+ *mask = cpu_possible_map;
}
static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
--- linux-2.6.x86.orig/kernel/cpuset.c
+++ linux-2.6.x86/kernel/cpuset.c
@@ -741,7 +741,7 @@ int cpuset_test_cpumask(struct task_stru
*/
void cpuset_change_cpumask(struct task_struct *tsk, struct cgroup_scanner *scan)
{
- set_cpus_allowed(tsk, (cgroup_cs(scan->cg))->cpus_allowed);
+ set_cpus_allowed_ptr(tsk, &((cgroup_cs(scan->cg))->cpus_allowed));
}
/**
@@ -1269,7 +1269,7 @@ static void cpuset_attach(struct cgroup_
mutex_lock(&callback_mutex);
guarantee_online_cpus(cs, &cpus);
- set_cpus_allowed(tsk, cpus);
+ set_cpus_allowed_ptr(tsk, &cpus);
mutex_unlock(&callback_mutex);
from = oldcs->mems_allowed;
@@ -1663,8 +1663,8 @@ static struct cgroup_subsys_state *cpuse
set_bit(CS_SPREAD_SLAB, &cs->flags);
set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
set_bit(CS_SYSTEM, &cs->flags);
- cs->cpus_allowed = CPU_MASK_NONE;
- cs->mems_allowed = NODE_MASK_NONE;
+ cpus_clear(cs->cpus_allowed);
+ nodes_clear(cs->mems_allowed);
cs->mems_generation = cpuset_mems_generation++;
fmeter_init(&cs->fmeter);
@@ -1737,8 +1737,8 @@ int __init cpuset_init(void)
{
int err = 0;
- top_cpuset.cpus_allowed = CPU_MASK_ALL;
- top_cpuset.mems_allowed = NODE_MASK_ALL;
+ cpus_setall(top_cpuset.cpus_allowed);
+ nodes_setall(top_cpuset.mems_allowed);
fmeter_init(&top_cpuset.fmeter);
top_cpuset.mems_generation = cpuset_mems_generation++;
@@ -1957,6 +1957,7 @@ void __init cpuset_init_smp(void)
* cpuset_cpus_allowed - return cpus_allowed mask from a tasks cpuset.
* @tsk: pointer to task_struct from which to obtain cpuset->cpus_allowed.
+ * @pmask: pointer to cpumask_t variable to receive cpus_allowed set.
*
* Description: Returns the cpumask_t cpus_allowed of the cpuset
* attached to the specified @tsk. Guaranteed to return some non-empty
@@ -1964,35 +1965,27 @@ void __init cpuset_init_smp(void)
* tasks cpuset.
**/
-cpumask_t cpuset_cpus_allowed(struct task_struct *tsk)
+void cpuset_cpus_allowed(struct task_struct *tsk, cpumask_t *pmask)
{
- cpumask_t mask;
-
mutex_lock(&callback_mutex);
- mask = cpuset_cpus_allowed_locked(tsk);
+ cpuset_cpus_allowed_locked(tsk, pmask);
mutex_unlock(&callback_mutex);
-
- return mask;
}
/**
* cpuset_cpus_allowed_locked - return cpus_allowed mask from a tasks cpuset.
* Must be called with callback_mutex held.
**/
-cpumask_t cpuset_cpus_allowed_locked(struct task_struct *tsk)
+void cpuset_cpus_allowed_locked(struct task_struct *tsk, cpumask_t *pmask)
{
- cpumask_t mask;
-
task_lock(tsk);
- guarantee_online_cpus(task_cs(tsk), &mask);
+ guarantee_online_cpus(task_cs(tsk), pmask);
task_unlock(tsk);
-
- return mask;
}
void cpuset_init_current_mems_allowed(void)
{
- current->mems_allowed = NODE_MASK_ALL;
+ nodes_setall(current->mems_allowed);
}
/**
--- linux-2.6.x86.orig/kernel/sched.c
+++ linux-2.6.x86/kernel/sched.c
@@ -4996,13 +4996,13 @@ long sched_setaffinity(pid_t pid, cpumas
if (retval)
goto out_unlock;
- cpus_allowed = cpuset_cpus_allowed(p);
+ cpuset_cpus_allowed(p, &cpus_allowed);
cpus_and(new_mask, new_mask, cpus_allowed);
again:
retval = set_cpus_allowed(p, new_mask);
if (!retval) {
- cpus_allowed = cpuset_cpus_allowed(p);
+ cpuset_cpus_allowed(p, &cpus_allowed);
if (!cpus_subset(new_mask, cpus_allowed)) {
/*
* We must have raced with a concurrent cpuset
@@ -5719,7 +5719,9 @@ static void move_task_off_dead_cpu(int d
/* No more Mr. Nice Guy. */
if (dest_cpu >= nr_cpu_ids) {
- cpumask_t cpus_allowed = cpuset_cpus_allowed_locked(p);
+ cpumask_t cpus_allowed;
+
+ cpuset_cpus_allowed_locked(p, &cpus_allowed);
/*
* Try to stay on the same cpuset, where the
* current cpuset may be a subset of all cpus.
--- linux-2.6.x86.orig/mm/pdflush.c
+++ linux-2.6.x86/mm/pdflush.c
@@ -187,8 +187,8 @@ static int pdflush(void *dummy)
* This is needed as pdflush's are dynamically created and destroyed.
* The boottime pdflush's are easily placed w/o these 2 lines.
*/
- cpus_allowed = cpuset_cpus_allowed(current);
- set_cpus_allowed(current, cpus_allowed);
+ cpuset_cpus_allowed(current, &cpus_allowed);
+ set_cpus_allowed_ptr(current, &cpus_allowed);
return __pdflush(&my_work);
}
--
* [PATCH 08/12] generic: reduce stack pressure in sched_affinity
2008-04-05 1:11 [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v3 Mike Travis
` (6 preceding siblings ...)
2008-04-05 1:11 ` [PATCH 07/12] cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer Mike Travis
@ 2008-04-05 1:11 ` Mike Travis
2008-04-05 1:11 ` [PATCH 09/12] numa: move large array from stack to _initdata section Mike Travis
` (3 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel,
Paul Jackson, Cliff Wickman
[-- Attachment #1: cpumask_affinity --]
[-- Type: text/plain, Size: 6158 bytes --]
* Modify sched_affinity functions to pass cpumask_t variables by reference
instead of by value (the new calling convention is sketched below).
* Use new set_cpus_allowed_ptr function.
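A sketch of the resulting calling convention, using the names from the
kernel/sched.c hunk below:

        long sched_setaffinity(pid_t pid, const cpumask_t *in_mask);

        /* callers now hand over a pointer, not an NR_CPUS-bit value */
        cpumask_t new_mask;
        /* ... fill in new_mask ... */
        retval = sched_setaffinity(pid, &new_mask);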
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Jackson <pj@sgi.com>
Cc: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/cpu/mcheck/mce_amd_64.c | 46 ++++++++++++++++----------------
include/linux/sched.h | 2 -
kernel/compat.c | 2 -
kernel/rcupreempt.c | 4 +-
kernel/sched.c | 5 ++-
5 files changed, 30 insertions(+), 29 deletions(-)
--- linux-2.6.x86.orig/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
+++ linux-2.6.x86/arch/x86/kernel/cpu/mcheck/mce_amd_64.c
@@ -251,18 +251,18 @@ struct threshold_attr {
ssize_t(*store) (struct threshold_block *, const char *, size_t count);
};
-static cpumask_t affinity_set(unsigned int cpu)
+static void affinity_set(unsigned int cpu, cpumask_t *oldmask,
+ cpumask_t *newmask)
{
- cpumask_t oldmask = current->cpus_allowed;
- cpumask_t newmask = CPU_MASK_NONE;
- cpu_set(cpu, newmask);
- set_cpus_allowed(current, newmask);
- return oldmask;
+ *oldmask = current->cpus_allowed;
+ cpus_clear(*newmask);
+ cpu_set(cpu, *newmask);
+ set_cpus_allowed_ptr(current, newmask);
}
-static void affinity_restore(cpumask_t oldmask)
+static void affinity_restore(const cpumask_t *oldmask)
{
- set_cpus_allowed(current, oldmask);
+ set_cpus_allowed_ptr(current, oldmask);
}
#define SHOW_FIELDS(name) \
@@ -277,15 +277,15 @@ static ssize_t store_interrupt_enable(st
const char *buf, size_t count)
{
char *end;
- cpumask_t oldmask;
+ cpumask_t oldmask, newmask;
unsigned long new = simple_strtoul(buf, &end, 0);
if (end == buf)
return -EINVAL;
b->interrupt_enable = !!new;
- oldmask = affinity_set(b->cpu);
+ affinity_set(b->cpu, &oldmask, &newmask);
threshold_restart_bank(b, 0, 0);
- affinity_restore(oldmask);
+ affinity_restore(&oldmask);
return end - buf;
}
@@ -294,7 +294,7 @@ static ssize_t store_threshold_limit(str
const char *buf, size_t count)
{
char *end;
- cpumask_t oldmask;
+ cpumask_t oldmask, newmask;
u16 old;
unsigned long new = simple_strtoul(buf, &end, 0);
if (end == buf)
@@ -306,9 +306,9 @@ static ssize_t store_threshold_limit(str
old = b->threshold_limit;
b->threshold_limit = new;
- oldmask = affinity_set(b->cpu);
+ affinity_set(b->cpu, &oldmask, &newmask);
threshold_restart_bank(b, 0, old);
- affinity_restore(oldmask);
+ affinity_restore(&oldmask);
return end - buf;
}
@@ -316,10 +316,10 @@ static ssize_t store_threshold_limit(str
static ssize_t show_error_count(struct threshold_block *b, char *buf)
{
u32 high, low;
- cpumask_t oldmask;
- oldmask = affinity_set(b->cpu);
+ cpumask_t oldmask, newmask;
+ affinity_set(b->cpu, &oldmask, &newmask);
rdmsr(b->address, low, high);
- affinity_restore(oldmask);
+ affinity_restore(&oldmask);
return sprintf(buf, "%x\n",
(high & 0xFFF) - (THRESHOLD_MAX - b->threshold_limit));
}
@@ -327,10 +327,10 @@ static ssize_t show_error_count(struct t
static ssize_t store_error_count(struct threshold_block *b,
const char *buf, size_t count)
{
- cpumask_t oldmask;
- oldmask = affinity_set(b->cpu);
+ cpumask_t oldmask, newmask;
+ affinity_set(b->cpu, &oldmask, &newmask);
threshold_restart_bank(b, 1, 0);
- affinity_restore(oldmask);
+ affinity_restore(&oldmask);
return 1;
}
@@ -468,7 +468,7 @@ static __cpuinit int threshold_create_ba
{
int i, err = 0;
struct threshold_bank *b = NULL;
- cpumask_t oldmask = CPU_MASK_NONE;
+ cpumask_t oldmask, newmask;
char name[32];
sprintf(name, "threshold_bank%i", bank);
@@ -519,10 +519,10 @@ static __cpuinit int threshold_create_ba
per_cpu(threshold_banks, cpu)[bank] = b;
- oldmask = affinity_set(cpu);
+ affinity_set(cpu, &oldmask, &newmask);
err = allocate_threshold_blocks(cpu, bank, 0,
MSR_IA32_MC0_MISC + bank * 4);
- affinity_restore(oldmask);
+ affinity_restore(&oldmask);
if (err)
goto out_free;
--- linux-2.6.x86.orig/include/linux/sched.h
+++ linux-2.6.x86/include/linux/sched.h
@@ -2086,7 +2086,7 @@ ftrace_special(unsigned long arg1, unsig
}
#endif
-extern long sched_setaffinity(pid_t pid, cpumask_t new_mask);
+extern long sched_setaffinity(pid_t pid, const cpumask_t *new_mask);
extern long sched_getaffinity(pid_t pid, cpumask_t *mask);
extern int sched_mc_power_savings, sched_smt_power_savings;
--- linux-2.6.x86.orig/kernel/compat.c
+++ linux-2.6.x86/kernel/compat.c
@@ -446,7 +446,7 @@ asmlinkage long compat_sys_sched_setaffi
if (retval)
return retval;
- return sched_setaffinity(pid, new_mask);
+ return sched_setaffinity(pid, &new_mask);
}
asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
--- linux-2.6.x86.orig/kernel/rcupreempt.c
+++ linux-2.6.x86/kernel/rcupreempt.c
@@ -1005,10 +1005,10 @@ void __synchronize_sched(void)
if (sched_getaffinity(0, &oldmask) < 0)
oldmask = cpu_possible_map;
for_each_online_cpu(cpu) {
- sched_setaffinity(0, cpumask_of_cpu(cpu));
+ sched_setaffinity(0, &cpumask_of_cpu(cpu));
schedule();
}
- sched_setaffinity(0, oldmask);
+ sched_setaffinity(0, &oldmask);
}
EXPORT_SYMBOL_GPL(__synchronize_sched);
--- linux-2.6.x86.orig/kernel/sched.c
+++ linux-2.6.x86/kernel/sched.c
@@ -4963,9 +4963,10 @@ out_unlock:
return retval;
}
-long sched_setaffinity(pid_t pid, cpumask_t new_mask)
+long sched_setaffinity(pid_t pid, const cpumask_t *in_mask)
{
cpumask_t cpus_allowed;
+ cpumask_t new_mask = *in_mask;
struct task_struct *p;
int retval;
@@ -5046,7 +5047,7 @@ asmlinkage long sys_sched_setaffinity(pi
if (retval)
return retval;
- return sched_setaffinity(pid, new_mask);
+ return sched_setaffinity(pid, &new_mask);
}
/*
--
* [PATCH 09/12] numa: move large array from stack to _initdata section
2008-04-05 1:11 [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v3 Mike Travis
` (7 preceding siblings ...)
2008-04-05 1:11 ` [PATCH 08/12] generic: reduce stack pressure in sched_affinity Mike Travis
@ 2008-04-05 1:11 ` Mike Travis
2008-04-05 1:11 ` [PATCH 10/12] nodemask: use new node_to_cpumask_ptr function Mike Travis
` (2 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
[-- Attachment #1: numa_initmem_init --]
[-- Type: text/plain, Size: 1153 bytes --]
* Move the large array "struct bootnode nodes" from the stack to the
__initdata section to reduce the amount of stack space required.
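Since numa_emulation() runs only during early boot, the array can live
in __initdata: it then occupies init memory that the kernel discards
after boot instead of MAX_NUMNODES * sizeof(struct bootnode) bytes of
kernel stack. The pattern, as in the hunk below:

        /* static storage, reclaimed after init; never on the stack */
        static struct bootnode nodes[MAX_NUMNODES] __initdata;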
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/mm/numa_64.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- linux-2.6.x86.orig/arch/x86/mm/numa_64.c
+++ linux-2.6.x86/arch/x86/mm/numa_64.c
@@ -411,9 +411,10 @@ static int __init split_nodes_by_size(st
* Sets up the system RAM area from start_pfn to end_pfn according to the
* numa=fake command-line option.
*/
+static struct bootnode nodes[MAX_NUMNODES] __initdata;
+
static int __init numa_emulation(unsigned long start_pfn, unsigned long end_pfn)
{
- struct bootnode nodes[MAX_NUMNODES];
u64 size, addr = start_pfn << PAGE_SHIFT;
u64 max_addr = end_pfn << PAGE_SHIFT;
int num_nodes = 0, num = 0, coeff_flag, coeff = -1, i;
--
* [PATCH 10/12] nodemask: use new node_to_cpumask_ptr function
2008-04-05 1:11 [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v3 Mike Travis
` (8 preceding siblings ...)
2008-04-05 1:11 ` [PATCH 09/12] numa: move large array from stack to _initdata section Mike Travis
@ 2008-04-05 1:11 ` Mike Travis
2008-04-05 1:11 ` [PATCH 11/12] cpumask: reduce stack usage in SD_x_INIT initializers Mike Travis
2008-04-05 1:11 ` [PATCH 12/12] cpumask: Cleanup more uses of CPU_MASK and NODE_MASK Mike Travis
11 siblings, 0 replies; 17+ messages in thread
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel,
Greg Kroah-Hartman, Greg Banks
[-- Attachment #1: node_to_cpumask_ptr --]
[-- Type: text/plain, Size: 8568 bytes --]
* Use the new node_to_cpumask_ptr macro, which creates a pointer to
the cpumask for a given node (a rough sketch follows this list). Its
definition is in the mm patch:
asm-generic-add-node_to_cpumask_ptr-macro.patch
* Use new set_cpus_allowed_ptr function.
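A rough sketch of what node_to_cpumask_ptr does (the authoritative
definition is in the mm patch named above; this generic form is an
approximation): it declares a cpumask and a pointer to it, so callers
dereference the pointer instead of copying whole masks around:

        /* approximate generic expansion of node_to_cpumask_ptr(v, node) */
        cpumask_t _v = node_to_cpumask(node);
        const cpumask_t *v = &_v;

        /* node_to_cpumask_ptr_next(v, node) then just reassigns _v */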
Depends on:
[mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
[sched-devel]: sched: add new set_cpus_allowed_ptr function
[x86/latest]: x86: add cpus_scnprintf function
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
# pci
Cc: Greg Kroah-Hartman <gregkh@suse.de>
# sunrpc
Cc: Greg Banks <gnb@melbourne.sgi.com>
# x86
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
One checkpatch error that I don't think can be fixed (the macro was
already in the source, and it expands to a statement rather than an
expression, so it cannot be wrapped in parentheses):
ERROR: Macros with complex values should be enclosed in parenthesis
#230: FILE: include/linux/topology.h:49:
#define for_each_node_with_cpus(node) \
for_each_online_node(node) \
if (nr_cpus_node(node))
---
drivers/base/node.c | 7 ++++---
kernel/sched.c | 29 ++++++++++++++---------------
mm/page_alloc.c | 6 +++---
mm/slab.c | 5 ++---
mm/vmscan.c | 18 ++++++++----------
net/sunrpc/svc.c | 16 +++++++++++-----
6 files changed, 42 insertions(+), 39 deletions(-)
--- linux-2.6.x86.orig/drivers/base/node.c
+++ linux-2.6.x86/drivers/base/node.c
@@ -22,14 +22,15 @@ static struct sysdev_class node_class =
static ssize_t node_read_cpumap(struct sys_device * dev, char * buf)
{
struct node *node_dev = to_node(dev);
- cpumask_t mask = node_to_cpumask(node_dev->sysdev.id);
+ node_to_cpumask_ptr(mask, node_dev->sysdev.id);
int len;
/* 2004/06/03: buf currently PAGE_SIZE, need > 1 char per 4 bits. */
BUILD_BUG_ON(MAX_NUMNODES/4 > PAGE_SIZE/2);
- len = cpumask_scnprintf(buf, PAGE_SIZE-1, mask);
- len += sprintf(buf + len, "\n");
+ len = cpumask_scnprintf(buf, PAGE_SIZE-2, *mask);
+ buf[len++] = '\n';
+ buf[len] = '\0';
return len;
}
--- linux-2.6.x86.orig/kernel/sched.c
+++ linux-2.6.x86/kernel/sched.c
@@ -6484,7 +6484,7 @@ init_sched_build_groups(cpumask_t span,
*
* Should use nodemask_t.
*/
-static int find_next_best_node(int node, unsigned long *used_nodes)
+static int find_next_best_node(int node, nodemask_t *used_nodes)
{
int i, n, val, min_val, best_node = 0;
@@ -6498,7 +6498,7 @@ static int find_next_best_node(int node,
continue;
/* Skip already used nodes */
- if (test_bit(n, used_nodes))
+ if (node_isset(n, *used_nodes))
continue;
/* Simple min distance search */
@@ -6510,14 +6510,13 @@ static int find_next_best_node(int node,
}
}
- set_bit(best_node, used_nodes);
+ node_set(best_node, *used_nodes);
return best_node;
}
/**
* sched_domain_node_span - get a cpumask for a node's sched_domain
* @node: node whose cpumask we're constructing
- * @size: number of nodes to include in this span
*
* Given a node, construct a good cpumask for its sched_domain to span. It
* should be one that prevents unnecessary balancing, but also spreads tasks
@@ -6525,22 +6524,22 @@ static int find_next_best_node(int node,
*/
static cpumask_t sched_domain_node_span(int node)
{
- DECLARE_BITMAP(used_nodes, MAX_NUMNODES);
- cpumask_t span, nodemask;
+ nodemask_t used_nodes;
+ cpumask_t span;
+ node_to_cpumask_ptr(nodemask, node);
int i;
cpus_clear(span);
- bitmap_zero(used_nodes, MAX_NUMNODES);
+ nodes_clear(used_nodes);
- nodemask = node_to_cpumask(node);
- cpus_or(span, span, nodemask);
- set_bit(node, used_nodes);
+ cpus_or(span, span, *nodemask);
+ node_set(node, used_nodes);
for (i = 1; i < SD_NODES_PER_DOMAIN; i++) {
- int next_node = find_next_best_node(node, used_nodes);
+ int next_node = find_next_best_node(node, &used_nodes);
- nodemask = node_to_cpumask(next_node);
- cpus_or(span, span, nodemask);
+ node_to_cpumask_ptr_next(nodemask, next_node);
+ cpus_or(span, span, *nodemask);
}
return span;
@@ -6937,6 +6936,7 @@ static int build_sched_domains(const cpu
for (j = 0; j < MAX_NUMNODES; j++) {
cpumask_t tmp, notcovered;
int n = (i + j) % MAX_NUMNODES;
+ node_to_cpumask_ptr(pnodemask, n);
cpus_complement(notcovered, covered);
cpus_and(tmp, notcovered, *cpu_map);
@@ -6944,8 +6944,7 @@ static int build_sched_domains(const cpu
if (cpus_empty(tmp))
break;
- nodemask = node_to_cpumask(n);
- cpus_and(tmp, tmp, nodemask);
+ cpus_and(tmp, tmp, *pnodemask);
if (cpus_empty(tmp))
continue;
--- linux-2.6.x86.orig/mm/page_alloc.c
+++ linux-2.6.x86/mm/page_alloc.c
@@ -2029,6 +2029,7 @@ static int find_next_best_node(int node,
int n, val;
int min_val = INT_MAX;
int best_node = -1;
+ node_to_cpumask_ptr(tmp, 0);
/* Use the local node if we haven't already */
if (!node_isset(node, *used_node_mask)) {
@@ -2037,7 +2038,6 @@ static int find_next_best_node(int node,
}
for_each_node_state(n, N_HIGH_MEMORY) {
- cpumask_t tmp;
/* Don't want a node to appear more than once */
if (node_isset(n, *used_node_mask))
@@ -2050,8 +2050,8 @@ static int find_next_best_node(int node,
val += (n < node);
/* Give preference to headless and unused nodes */
- tmp = node_to_cpumask(n);
- if (!cpus_empty(tmp))
+ node_to_cpumask_ptr_next(tmp, n);
+ if (!cpus_empty(*tmp))
val += PENALTY_FOR_NODE_WITH_CPUS;
/* Slight preference for less loaded node */
--- linux-2.6.x86.orig/mm/slab.c
+++ linux-2.6.x86/mm/slab.c
@@ -1160,14 +1160,13 @@ static void __cpuinit cpuup_canceled(lon
struct kmem_cache *cachep;
struct kmem_list3 *l3 = NULL;
int node = cpu_to_node(cpu);
+ node_to_cpumask_ptr(mask, node);
list_for_each_entry(cachep, &cache_chain, next) {
struct array_cache *nc;
struct array_cache *shared;
struct array_cache **alien;
- cpumask_t mask;
- mask = node_to_cpumask(node);
/* cpu is dead; no one can alloc from it. */
nc = cachep->array[cpu];
cachep->array[cpu] = NULL;
@@ -1183,7 +1182,7 @@ static void __cpuinit cpuup_canceled(lon
if (nc)
free_block(cachep, nc->entry, nc->avail, node);
- if (!cpus_empty(mask)) {
+ if (!cpus_empty(*mask)) {
spin_unlock_irq(&l3->list_lock);
goto free_array_cache;
}
--- linux-2.6.x86.orig/mm/vmscan.c
+++ linux-2.6.x86/mm/vmscan.c
@@ -1647,11 +1647,10 @@ static int kswapd(void *p)
struct reclaim_state reclaim_state = {
.reclaimed_slab = 0,
};
- cpumask_t cpumask;
+ node_to_cpumask_ptr(cpumask, pgdat->node_id);
- cpumask = node_to_cpumask(pgdat->node_id);
- if (!cpus_empty(cpumask))
- set_cpus_allowed(tsk, cpumask);
+ if (!cpus_empty(*cpumask))
+ set_cpus_allowed_ptr(tsk, cpumask);
current->reclaim_state = &reclaim_state;
/*
@@ -1880,17 +1879,16 @@ out:
static int __devinit cpu_callback(struct notifier_block *nfb,
unsigned long action, void *hcpu)
{
- pg_data_t *pgdat;
- cpumask_t mask;
int nid;
if (action == CPU_ONLINE || action == CPU_ONLINE_FROZEN) {
for_each_node_state(nid, N_HIGH_MEMORY) {
- pgdat = NODE_DATA(nid);
- mask = node_to_cpumask(pgdat->node_id);
- if (any_online_cpu(mask) != NR_CPUS)
+ pg_data_t *pgdat = NODE_DATA(nid);
+ node_to_cpumask_ptr(mask, pgdat->node_id);
+
+ if (any_online_cpu(*mask) < nr_cpu_ids)
/* One of our CPUs online: restore mask */
- set_cpus_allowed(pgdat->kswapd, mask);
+ set_cpus_allowed_ptr(pgdat->kswapd, mask);
}
}
return NOTIFY_OK;
--- linux-2.6.x86.orig/net/sunrpc/svc.c
+++ linux-2.6.x86/net/sunrpc/svc.c
@@ -301,7 +301,6 @@ static inline int
svc_pool_map_set_cpumask(unsigned int pidx, cpumask_t *oldmask)
{
struct svc_pool_map *m = &svc_pool_map;
- unsigned int node; /* or cpu */
/*
* The caller checks for sv_nrpools > 1, which
@@ -314,16 +313,23 @@ svc_pool_map_set_cpumask(unsigned int pi
default:
return 0;
case SVC_POOL_PERCPU:
- node = m->pool_to[pidx];
+ {
+ unsigned int cpu = m->pool_to[pidx];
+
*oldmask = current->cpus_allowed;
- set_cpus_allowed(current, cpumask_of_cpu(node));
+ set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu));
return 1;
+ }
case SVC_POOL_PERNODE:
- node = m->pool_to[pidx];
+ {
+ unsigned int node = m->pool_to[pidx];
+ node_to_cpumask_ptr(nodecpumask, node);
+
*oldmask = current->cpus_allowed;
- set_cpus_allowed(current, node_to_cpumask(node));
+ set_cpus_allowed_ptr(current, nodecpumask);
return 1;
}
+ }
}
/*
--
* [PATCH 11/12] cpumask: reduce stack usage in SD_x_INIT initializers
2008-04-05 1:11 [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v3 Mike Travis
` (9 preceding siblings ...)
2008-04-05 1:11 ` [PATCH 10/12] nodemask: use new node_to_cpumask_ptr function Mike Travis
@ 2008-04-05 1:11 ` Mike Travis
2008-04-05 1:11 ` [PATCH 12/12] cpumask: Cleanup more uses of CPU_MASK and NODE_MASK Mike Travis
11 siblings, 0 replies; 17+ messages in thread
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
[-- Attachment #1: kernel_sched_c --]
[-- Type: text/plain, Size: 30788 bytes --]
* Remove the empty cpumask_t and all other zero/NULL field initializers
from the SD_*_INIT macros, so that only non-zero/non-null fields
remain; a memset(0) clears the rest. Also, don't inline the
initializer functions, to save on stack space in build_sched_domains()
(see the sketch after this list).
* Merge change to include/linux/topology.h that uses the new
node_to_cpumask_ptr function in the nr_cpus_node macro into
this patch.
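Sketched from the kernel/sched.c hunk below: with the zero/NULL fields
gone from SD_*_INIT, the remaining compile-time constant no longer
contains huge all-zero cpumasks, and making the helpers noinline keeps
each one's scratch space out of build_sched_domains()'s stack frame:

        #define SD_INIT(sd, type)       sd_init_##type(sd)
        #define SD_INIT_FUNC(type)                                      \
        static noinline void sd_init_##type(struct sched_domain *sd)    \
        {                                                                \
                memset(sd, 0, sizeof(*sd));                              \
                *sd = SD_##type##_INIT;                                  \
        }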
Depends on:
[mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
include/asm-x86/topology.h | 5
include/linux/topology.h | 46 -----
kernel/sched.c | 368 ++++++++++++++++++++++++++++++---------------
3 files changed, 256 insertions(+), 163 deletions(-)
--- linux-2.6.x86.orig/include/asm-x86/topology.h
+++ linux-2.6.x86/include/asm-x86/topology.h
@@ -155,10 +155,6 @@ extern unsigned long node_remap_size[];
/* sched_domains SD_NODE_INIT for NUMAQ machines */
#define SD_NODE_INIT (struct sched_domain) { \
- .span = CPU_MASK_NONE, \
- .parent = NULL, \
- .child = NULL, \
- .groups = NULL, \
.min_interval = 8, \
.max_interval = 32, \
.busy_factor = 32, \
@@ -176,7 +172,6 @@ extern unsigned long node_remap_size[];
| SD_WAKE_BALANCE, \
.last_balance = jiffies, \
.balance_interval = 1, \
- .nr_balance_failed = 0, \
}
#ifdef CONFIG_X86_64_ACPI_NUMA
--- linux-2.6.x86.orig/include/linux/topology.h
+++ linux-2.6.x86/include/linux/topology.h
@@ -38,16 +38,15 @@
#endif
#ifndef nr_cpus_node
-#define nr_cpus_node(node) \
- ({ \
- cpumask_t __tmp__; \
- __tmp__ = node_to_cpumask(node); \
- cpus_weight(__tmp__); \
+#define nr_cpus_node(node) \
+ ({ \
+ node_to_cpumask_ptr(__tmp__, node); \
+ cpus_weight(*__tmp__); \
})
#endif
-#define for_each_node_with_cpus(node) \
- for_each_online_node(node) \
+#define for_each_node_with_cpus(node) \
+ for_each_online_node(node) \
if (nr_cpus_node(node))
void arch_update_cpu_topology(void);
@@ -80,7 +79,9 @@ void arch_update_cpu_topology(void);
* by defining their own arch-specific initializer in include/asm/topology.h.
* A definition there will automagically override these default initializers
* and allow arch-specific performance tuning of sched_domains.
+ * (Only non-zero and non-null fields need be specified.)
*/
+
#ifdef CONFIG_SCHED_SMT
/* MCD - Do we really need this? It is always on if CONFIG_SCHED_SMT is,
* so can't we drop this in favor of CONFIG_SCHED_SMT?
@@ -89,20 +90,10 @@ void arch_update_cpu_topology(void);
/* Common values for SMT siblings */
#ifndef SD_SIBLING_INIT
#define SD_SIBLING_INIT (struct sched_domain) { \
- .span = CPU_MASK_NONE, \
- .parent = NULL, \
- .child = NULL, \
- .groups = NULL, \
.min_interval = 1, \
.max_interval = 2, \
.busy_factor = 64, \
.imbalance_pct = 110, \
- .cache_nice_tries = 0, \
- .busy_idx = 0, \
- .idle_idx = 0, \
- .newidle_idx = 0, \
- .wake_idx = 0, \
- .forkexec_idx = 0, \
.flags = SD_LOAD_BALANCE \
| SD_BALANCE_NEWIDLE \
| SD_BALANCE_FORK \
@@ -112,7 +103,6 @@ void arch_update_cpu_topology(void);
| SD_SHARE_CPUPOWER, \
.last_balance = jiffies, \
.balance_interval = 1, \
- .nr_balance_failed = 0, \
}
#endif
#endif /* CONFIG_SCHED_SMT */
@@ -121,18 +111,12 @@ void arch_update_cpu_topology(void);
/* Common values for MC siblings. for now mostly derived from SD_CPU_INIT */
#ifndef SD_MC_INIT
#define SD_MC_INIT (struct sched_domain) { \
- .span = CPU_MASK_NONE, \
- .parent = NULL, \
- .child = NULL, \
- .groups = NULL, \
.min_interval = 1, \
.max_interval = 4, \
.busy_factor = 64, \
.imbalance_pct = 125, \
.cache_nice_tries = 1, \
.busy_idx = 2, \
- .idle_idx = 0, \
- .newidle_idx = 0, \
.wake_idx = 1, \
.forkexec_idx = 1, \
.flags = SD_LOAD_BALANCE \
@@ -144,7 +128,6 @@ void arch_update_cpu_topology(void);
| BALANCE_FOR_MC_POWER, \
.last_balance = jiffies, \
.balance_interval = 1, \
- .nr_balance_failed = 0, \
}
#endif
#endif /* CONFIG_SCHED_MC */
@@ -152,10 +135,6 @@ void arch_update_cpu_topology(void);
/* Common values for CPUs */
#ifndef SD_CPU_INIT
#define SD_CPU_INIT (struct sched_domain) { \
- .span = CPU_MASK_NONE, \
- .parent = NULL, \
- .child = NULL, \
- .groups = NULL, \
.min_interval = 1, \
.max_interval = 4, \
.busy_factor = 64, \
@@ -174,16 +153,11 @@ void arch_update_cpu_topology(void);
| BALANCE_FOR_PKG_POWER,\
.last_balance = jiffies, \
.balance_interval = 1, \
- .nr_balance_failed = 0, \
}
#endif
/* sched_domains SD_ALLNODES_INIT for NUMA machines */
#define SD_ALLNODES_INIT (struct sched_domain) { \
- .span = CPU_MASK_NONE, \
- .parent = NULL, \
- .child = NULL, \
- .groups = NULL, \
.min_interval = 64, \
.max_interval = 64*num_online_cpus(), \
.busy_factor = 128, \
@@ -191,14 +165,10 @@ void arch_update_cpu_topology(void);
.cache_nice_tries = 1, \
.busy_idx = 3, \
.idle_idx = 3, \
- .newidle_idx = 0, /* unused */ \
- .wake_idx = 0, /* unused */ \
- .forkexec_idx = 0, /* unused */ \
.flags = SD_LOAD_BALANCE \
| SD_SERIALIZE, \
.last_balance = jiffies, \
.balance_interval = 64, \
- .nr_balance_failed = 0, \
}
#ifdef CONFIG_NUMA
--- linux-2.6.x86.orig/kernel/sched.c
+++ linux-2.6.x86/kernel/sched.c
@@ -1900,17 +1900,17 @@ find_idlest_group(struct sched_domain *s
* find_idlest_cpu - find the idlest cpu among the cpus in group.
*/
static int
-find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu,
+ cpumask_t *tmp)
{
- cpumask_t tmp;
unsigned long load, min_load = ULONG_MAX;
int idlest = -1;
int i;
/* Traverse only the allowed CPUs */
- cpus_and(tmp, group->cpumask, p->cpus_allowed);
+ cpus_and(*tmp, group->cpumask, p->cpus_allowed);
- for_each_cpu_mask(i, tmp) {
+ for_each_cpu_mask(i, *tmp) {
load = weighted_cpuload(i);
if (load < min_load || (load == min_load && i == this_cpu)) {
@@ -1949,7 +1949,7 @@ static int sched_balance_self(int cpu, i
}
while (sd) {
- cpumask_t span;
+ cpumask_t span, tmpmask;
struct sched_group *group;
int new_cpu, weight;
@@ -1965,7 +1965,7 @@ static int sched_balance_self(int cpu, i
continue;
}
- new_cpu = find_idlest_cpu(group, t, cpu);
+ new_cpu = find_idlest_cpu(group, t, cpu, &tmpmask);
if (new_cpu == -1 || new_cpu == cpu) {
/* Now try balancing at a lower domain level of cpu */
sd = sd->child;
@@ -2852,7 +2852,7 @@ static int move_one_task(struct rq *this
static struct sched_group *
find_busiest_group(struct sched_domain *sd, int this_cpu,
unsigned long *imbalance, enum cpu_idle_type idle,
- int *sd_idle, cpumask_t *cpus, int *balance)
+ int *sd_idle, const cpumask_t *cpus, int *balance)
{
struct sched_group *busiest = NULL, *this = NULL, *group = sd->groups;
unsigned long max_load, avg_load, total_load, this_load, total_pwr;
@@ -3153,7 +3153,7 @@ ret:
*/
static struct rq *
find_busiest_queue(struct sched_group *group, enum cpu_idle_type idle,
- unsigned long imbalance, cpumask_t *cpus)
+ unsigned long imbalance, const cpumask_t *cpus)
{
struct rq *busiest = NULL, *rq;
unsigned long max_load = 0;
@@ -3192,15 +3192,16 @@ find_busiest_queue(struct sched_group *g
*/
static int load_balance(int this_cpu, struct rq *this_rq,
struct sched_domain *sd, enum cpu_idle_type idle,
- int *balance)
+ int *balance, cpumask_t *cpus)
{
int ld_moved, all_pinned = 0, active_balance = 0, sd_idle = 0;
struct sched_group *group;
unsigned long imbalance;
struct rq *busiest;
- cpumask_t cpus = CPU_MASK_ALL;
unsigned long flags;
+ cpus_setall(*cpus);
+
/*
* When power savings policy is enabled for the parent domain, idle
* sibling can pick up load irrespective of busy siblings. In this case,
@@ -3215,7 +3216,7 @@ static int load_balance(int this_cpu, st
redo:
group = find_busiest_group(sd, this_cpu, &imbalance, idle, &sd_idle,
- &cpus, balance);
+ cpus, balance);
if (*balance == 0)
goto out_balanced;
@@ -3225,7 +3226,7 @@ redo:
goto out_balanced;
}
- busiest = find_busiest_queue(group, idle, imbalance, &cpus);
+ busiest = find_busiest_queue(group, idle, imbalance, cpus);
if (!busiest) {
schedstat_inc(sd, lb_nobusyq[idle]);
goto out_balanced;
@@ -3258,8 +3259,8 @@ redo:
/* All tasks on this runqueue were pinned by CPU affinity */
if (unlikely(all_pinned)) {
- cpu_clear(cpu_of(busiest), cpus);
- if (!cpus_empty(cpus))
+ cpu_clear(cpu_of(busiest), *cpus);
+ if (!cpus_empty(*cpus))
goto redo;
goto out_balanced;
}
@@ -3344,7 +3345,8 @@ out_one_pinned:
* this_rq is locked.
*/
static int
-load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd)
+load_balance_newidle(int this_cpu, struct rq *this_rq, struct sched_domain *sd,
+ cpumask_t *cpus)
{
struct sched_group *group;
struct rq *busiest = NULL;
@@ -3352,7 +3354,8 @@ load_balance_newidle(int this_cpu, struc
int ld_moved = 0;
int sd_idle = 0;
int all_pinned = 0;
- cpumask_t cpus = CPU_MASK_ALL;
+
+ cpus_setall(*cpus);
/*
* When power savings policy is enabled for the parent domain, idle
@@ -3367,14 +3370,13 @@ load_balance_newidle(int this_cpu, struc
schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
redo:
group = find_busiest_group(sd, this_cpu, &imbalance, CPU_NEWLY_IDLE,
- &sd_idle, &cpus, NULL);
+ &sd_idle, cpus, NULL);
if (!group) {
schedstat_inc(sd, lb_nobusyg[CPU_NEWLY_IDLE]);
goto out_balanced;
}
- busiest = find_busiest_queue(group, CPU_NEWLY_IDLE, imbalance,
- &cpus);
+ busiest = find_busiest_queue(group, CPU_NEWLY_IDLE, imbalance, cpus);
if (!busiest) {
schedstat_inc(sd, lb_nobusyq[CPU_NEWLY_IDLE]);
goto out_balanced;
@@ -3396,8 +3398,8 @@ redo:
spin_unlock(&busiest->lock);
if (unlikely(all_pinned)) {
- cpu_clear(cpu_of(busiest), cpus);
- if (!cpus_empty(cpus))
+ cpu_clear(cpu_of(busiest), *cpus);
+ if (!cpus_empty(*cpus))
goto redo;
}
}
@@ -3431,6 +3433,7 @@ static void idle_balance(int this_cpu, s
struct sched_domain *sd;
int pulled_task = -1;
unsigned long next_balance = jiffies + HZ;
+ cpumask_t tmpmask;
for_each_domain(this_cpu, sd) {
unsigned long interval;
@@ -3440,8 +3443,8 @@ static void idle_balance(int this_cpu, s
if (sd->flags & SD_BALANCE_NEWIDLE)
/* If we've pulled tasks over stop searching: */
- pulled_task = load_balance_newidle(this_cpu,
- this_rq, sd);
+ pulled_task = load_balance_newidle(this_cpu, this_rq,
+ sd, &tmpmask);
interval = msecs_to_jiffies(sd->balance_interval);
if (time_after(next_balance, sd->last_balance + interval))
@@ -3600,6 +3603,7 @@ static void rebalance_domains(int cpu, e
/* Earliest time when we have to do rebalance again */
unsigned long next_balance = jiffies + 60*HZ;
int update_next_balance = 0;
+ cpumask_t tmp;
for_each_domain(cpu, sd) {
if (!(sd->flags & SD_LOAD_BALANCE))
@@ -3623,7 +3627,7 @@ static void rebalance_domains(int cpu, e
}
if (time_after_eq(jiffies, sd->last_balance + interval)) {
- if (load_balance(cpu, rq, sd, idle, &balance)) {
+ if (load_balance(cpu, rq, sd, idle, &balance, &tmp)) {
/*
* We've pulled tasks over so either we're no
* longer idle, or one of our SMT siblings is
@@ -5000,7 +5004,7 @@ long sched_setaffinity(pid_t pid, const
cpuset_cpus_allowed(p, &cpus_allowed);
cpus_and(new_mask, new_mask, cpus_allowed);
again:
- retval = set_cpus_allowed(p, new_mask);
+ retval = set_cpus_allowed_ptr(p, &new_mask);
if (!retval) {
cpuset_cpus_allowed(p, &cpus_allowed);
@@ -5758,7 +5762,7 @@ static void move_task_off_dead_cpu(int d
*/
static void migrate_nr_uninterruptible(struct rq *rq_src)
{
- struct rq *rq_dest = cpu_rq(any_online_cpu(CPU_MASK_ALL));
+ struct rq *rq_dest = cpu_rq(any_online_cpu(*CPU_MASK_ALL_PTR));
unsigned long flags;
local_irq_save(flags);
@@ -6172,14 +6176,14 @@ void __init migration_init(void)
#ifdef CONFIG_SCHED_DEBUG
-static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level)
+static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
+ cpumask_t *groupmask)
{
struct sched_group *group = sd->groups;
- cpumask_t groupmask;
char str[256];
cpulist_scnprintf(str, sizeof(str), sd->span);
- cpus_clear(groupmask);
+ cpus_clear(*groupmask);
printk(KERN_DEBUG "%*s domain %d: ", level, "", level);
@@ -6223,13 +6227,13 @@ static int sched_domain_debug_one(struct
break;
}
- if (cpus_intersects(groupmask, group->cpumask)) {
+ if (cpus_intersects(*groupmask, group->cpumask)) {
printk(KERN_CONT "\n");
printk(KERN_ERR "ERROR: repeated CPUs\n");
break;
}
- cpus_or(groupmask, groupmask, group->cpumask);
+ cpus_or(*groupmask, *groupmask, group->cpumask);
cpulist_scnprintf(str, sizeof(str), group->cpumask);
printk(KERN_CONT " %s", str);
@@ -6238,10 +6242,10 @@ static int sched_domain_debug_one(struct
} while (group != sd->groups);
printk(KERN_CONT "\n");
- if (!cpus_equal(sd->span, groupmask))
+ if (!cpus_equal(sd->span, *groupmask))
printk(KERN_ERR "ERROR: groups don't span domain->span\n");
- if (sd->parent && !cpus_subset(groupmask, sd->parent->span))
+ if (sd->parent && !cpus_subset(*groupmask, sd->parent->span))
printk(KERN_ERR "ERROR: parent span is not a superset "
"of domain->span\n");
return 0;
@@ -6249,6 +6253,7 @@ static int sched_domain_debug_one(struct
static void sched_domain_debug(struct sched_domain *sd, int cpu)
{
+ cpumask_t *groupmask;
int level = 0;
if (!sd) {
@@ -6258,14 +6263,21 @@ static void sched_domain_debug(struct sc
printk(KERN_DEBUG "CPU%d attaching sched-domain:\n", cpu);
+ groupmask = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
+ if (!groupmask) {
+ printk(KERN_DEBUG "Cannot load-balance (out of memory)\n");
+ return;
+ }
+
for (;;) {
- if (sched_domain_debug_one(sd, cpu, level))
+ if (sched_domain_debug_one(sd, cpu, level, groupmask))
break;
level++;
sd = sd->parent;
if (!sd)
break;
}
+ kfree(groupmask);
}
#else
# define sched_domain_debug(sd, cpu) do { } while (0)
@@ -6435,30 +6447,33 @@ cpu_attach_domain(struct sched_domain *s
* and ->cpu_power to 0.
*/
static void
-init_sched_build_groups(cpumask_t span, const cpumask_t *cpu_map,
+init_sched_build_groups(const cpumask_t *span, const cpumask_t *cpu_map,
int (*group_fn)(int cpu, const cpumask_t *cpu_map,
- struct sched_group **sg))
+ struct sched_group **sg,
+ cpumask_t *tmpmask),
+ cpumask_t *covered, cpumask_t *tmpmask)
{
struct sched_group *first = NULL, *last = NULL;
- cpumask_t covered = CPU_MASK_NONE;
int i;
- for_each_cpu_mask(i, span) {
+ cpus_clear(*covered);
+
+ for_each_cpu_mask(i, *span) {
struct sched_group *sg;
- int group = group_fn(i, cpu_map, &sg);
+ int group = group_fn(i, cpu_map, &sg, tmpmask);
int j;
- if (cpu_isset(i, covered))
+ if (cpu_isset(i, *covered))
continue;
- sg->cpumask = CPU_MASK_NONE;
+ cpus_clear(sg->cpumask);
sg->__cpu_power = 0;
- for_each_cpu_mask(j, span) {
- if (group_fn(j, cpu_map, NULL) != group)
+ for_each_cpu_mask(j, *span) {
+ if (group_fn(j, cpu_map, NULL, tmpmask) != group)
continue;
- cpu_set(j, covered);
+ cpu_set(j, *covered);
cpu_set(j, sg->cpumask);
}
if (!first)
@@ -6556,7 +6571,8 @@ static DEFINE_PER_CPU(struct sched_domai
static DEFINE_PER_CPU(struct sched_group, sched_group_cpus);
static int
-cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+ cpumask_t *unused)
{
if (sg)
*sg = &per_cpu(sched_group_cpus, cpu);
@@ -6574,19 +6590,22 @@ static DEFINE_PER_CPU(struct sched_group
#if defined(CONFIG_SCHED_MC) && defined(CONFIG_SCHED_SMT)
static int
-cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+ cpumask_t *mask)
{
int group;
- cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
- cpus_and(mask, mask, *cpu_map);
- group = first_cpu(mask);
+
+ *mask = per_cpu(cpu_sibling_map, cpu);
+ cpus_and(*mask, *mask, *cpu_map);
+ group = first_cpu(*mask);
if (sg)
*sg = &per_cpu(sched_group_core, group);
return group;
}
#elif defined(CONFIG_SCHED_MC)
static int
-cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+ cpumask_t *unused)
{
if (sg)
*sg = &per_cpu(sched_group_core, cpu);
@@ -6598,17 +6617,18 @@ static DEFINE_PER_CPU(struct sched_domai
static DEFINE_PER_CPU(struct sched_group, sched_group_phys);
static int
-cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
+ cpumask_t *mask)
{
int group;
#ifdef CONFIG_SCHED_MC
- cpumask_t mask = *cpu_coregroup_map(cpu);
- cpus_and(mask, mask, *cpu_map);
- group = first_cpu(mask);
+ *mask = *cpu_coregroup_map(cpu);
+ cpus_and(*mask, *mask, *cpu_map);
+ group = first_cpu(*mask);
#elif defined(CONFIG_SCHED_SMT)
- cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
- cpus_and(mask, mask, *cpu_map);
- group = first_cpu(mask);
+ *mask = per_cpu(cpu_sibling_map, cpu);
+ cpus_and(*mask, *mask, *cpu_map);
+ group = first_cpu(*mask);
#else
group = cpu;
#endif
@@ -6630,13 +6650,13 @@ static DEFINE_PER_CPU(struct sched_domai
static DEFINE_PER_CPU(struct sched_group, sched_group_allnodes);
static int cpu_to_allnodes_group(int cpu, const cpumask_t *cpu_map,
- struct sched_group **sg)
+ struct sched_group **sg, cpumask_t *nodemask)
{
- cpumask_t nodemask = node_to_cpumask(cpu_to_node(cpu));
int group;
- cpus_and(nodemask, nodemask, *cpu_map);
- group = first_cpu(nodemask);
+ *nodemask = node_to_cpumask(cpu_to_node(cpu));
+ cpus_and(*nodemask, *nodemask, *cpu_map);
+ group = first_cpu(*nodemask);
if (sg)
*sg = &per_cpu(sched_group_allnodes, group);
@@ -6672,7 +6692,7 @@ static void init_numa_sched_groups_power
#ifdef CONFIG_NUMA
/* Free memory allocated for various sched_group structures */
-static void free_sched_groups(const cpumask_t *cpu_map)
+static void free_sched_groups(const cpumask_t *cpu_map, cpumask_t *nodemask)
{
int cpu, i;
@@ -6684,11 +6704,11 @@ static void free_sched_groups(const cpum
continue;
for (i = 0; i < MAX_NUMNODES; i++) {
- cpumask_t nodemask = node_to_cpumask(i);
struct sched_group *oldsg, *sg = sched_group_nodes[i];
- cpus_and(nodemask, nodemask, *cpu_map);
- if (cpus_empty(nodemask))
+ *nodemask = node_to_cpumask(i);
+ cpus_and(*nodemask, *nodemask, *cpu_map);
+ if (cpus_empty(*nodemask))
continue;
if (sg == NULL)
@@ -6706,7 +6726,7 @@ next_sg:
}
}
#else
-static void free_sched_groups(const cpumask_t *cpu_map)
+static void free_sched_groups(const cpumask_t *cpu_map, cpumask_t *nodemask)
{
}
#endif
@@ -6764,6 +6784,65 @@ static void init_sched_groups_power(int
}
/*
+ * Initializers for schedule domains
+ * Non-inlined to reduce accumulated stack pressure in build_sched_domains()
+ */
+
+#define SD_INIT(sd, type) sd_init_##type(sd)
+#define SD_INIT_FUNC(type) \
+static noinline void sd_init_##type(struct sched_domain *sd) \
+{ \
+ memset(sd, 0, sizeof(*sd)); \
+ *sd = SD_##type##_INIT; \
+}
+
+SD_INIT_FUNC(CPU)
+#ifdef CONFIG_NUMA
+ SD_INIT_FUNC(ALLNODES)
+ SD_INIT_FUNC(NODE)
+#endif
+#ifdef CONFIG_SCHED_SMT
+ SD_INIT_FUNC(SIBLING)
+#endif
+#ifdef CONFIG_SCHED_MC
+ SD_INIT_FUNC(MC)
+#endif
+
+/*
+ * To minimize stack usage kmalloc room for cpumasks and share the
+ * space as the usage in build_sched_domains() dictates. Used only
+ * if the amount of space is significant.
+ */
+struct allmasks {
+ cpumask_t tmpmask; /* make this one first */
+ union {
+ cpumask_t nodemask;
+ cpumask_t this_sibling_map;
+ cpumask_t this_core_map;
+ };
+ cpumask_t send_covered;
+
+#ifdef CONFIG_NUMA
+ cpumask_t domainspan;
+ cpumask_t covered;
+ cpumask_t notcovered;
+#endif
+};
+
+#if NR_CPUS > 128
+#define SCHED_CPUMASK_ALLOC 1
+#define SCHED_CPUMASK_FREE(v) kfree(v)
+#define SCHED_CPUMASK_DECLARE(v) struct allmasks *v
+#else
+#define SCHED_CPUMASK_ALLOC 0
+#define SCHED_CPUMASK_FREE(v)
+#define SCHED_CPUMASK_DECLARE(v) struct allmasks _v, *v = &_v
+#endif
+
+#define SCHED_CPUMASK_VAR(v, a) cpumask_t *v = (cpumask_t *) \
+ ((unsigned long)(a) + offsetof(struct allmasks, v))
+
+/*
* Build sched domains for a given set of cpus and attach the sched domains
* to the individual cpus
*/
@@ -6771,6 +6850,8 @@ static int build_sched_domains(const cpu
{
int i;
struct root_domain *rd;
+ SCHED_CPUMASK_DECLARE(allmasks);
+ cpumask_t *tmpmask;
#ifdef CONFIG_NUMA
struct sched_group **sched_group_nodes = NULL;
int sd_allnodes = 0;
@@ -6784,38 +6865,60 @@ static int build_sched_domains(const cpu
printk(KERN_WARNING "Can not alloc sched group node list\n");
return -ENOMEM;
}
- sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;
#endif
rd = alloc_rootdomain();
if (!rd) {
printk(KERN_WARNING "Cannot alloc root domain\n");
+#ifdef CONFIG_NUMA
+ kfree(sched_group_nodes);
+#endif
return -ENOMEM;
}
+#if SCHED_CPUMASK_ALLOC
+ /* get space for all scratch cpumask variables */
+ allmasks = kmalloc(sizeof(*allmasks), GFP_KERNEL);
+ if (!allmasks) {
+ printk(KERN_WARNING "Cannot alloc cpumask array\n");
+ kfree(rd);
+#ifdef CONFIG_NUMA
+ kfree(sched_group_nodes);
+#endif
+ return -ENOMEM;
+ }
+#endif
+ tmpmask = (cpumask_t *)allmasks;
+
+
+#ifdef CONFIG_NUMA
+ sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;
+#endif
+
/*
* Set up domains for cpus specified by the cpu_map.
*/
for_each_cpu_mask(i, *cpu_map) {
struct sched_domain *sd = NULL, *p;
- cpumask_t nodemask = node_to_cpumask(cpu_to_node(i));
+ SCHED_CPUMASK_VAR(nodemask, allmasks);
- cpus_and(nodemask, nodemask, *cpu_map);
+ *nodemask = node_to_cpumask(cpu_to_node(i));
+ cpus_and(*nodemask, *nodemask, *cpu_map);
#ifdef CONFIG_NUMA
if (cpus_weight(*cpu_map) >
- SD_NODES_PER_DOMAIN*cpus_weight(nodemask)) {
+ SD_NODES_PER_DOMAIN*cpus_weight(*nodemask)) {
sd = &per_cpu(allnodes_domains, i);
- *sd = SD_ALLNODES_INIT;
+ SD_INIT(sd, ALLNODES);
sd->span = *cpu_map;
- cpu_to_allnodes_group(i, cpu_map, &sd->groups);
+ cpu_to_allnodes_group(i, cpu_map, &sd->groups, tmpmask);
p = sd;
sd_allnodes = 1;
} else
p = NULL;
sd = &per_cpu(node_domains, i);
- *sd = SD_NODE_INIT;
+ SD_INIT(sd, NODE);
sd->span = sched_domain_node_span(cpu_to_node(i));
sd->parent = p;
if (p)
@@ -6825,94 +6928,114 @@ static int build_sched_domains(const cpu
p = sd;
sd = &per_cpu(phys_domains, i);
- *sd = SD_CPU_INIT;
- sd->span = nodemask;
+ SD_INIT(sd, CPU);
+ sd->span = *nodemask;
sd->parent = p;
if (p)
p->child = sd;
- cpu_to_phys_group(i, cpu_map, &sd->groups);
+ cpu_to_phys_group(i, cpu_map, &sd->groups, tmpmask);
#ifdef CONFIG_SCHED_MC
p = sd;
sd = &per_cpu(core_domains, i);
- *sd = SD_MC_INIT;
+ SD_INIT(sd, MC);
sd->span = *cpu_coregroup_map(i);
cpus_and(sd->span, sd->span, *cpu_map);
sd->parent = p;
p->child = sd;
- cpu_to_core_group(i, cpu_map, &sd->groups);
+ cpu_to_core_group(i, cpu_map, &sd->groups, tmpmask);
#endif
#ifdef CONFIG_SCHED_SMT
p = sd;
sd = &per_cpu(cpu_domains, i);
- *sd = SD_SIBLING_INIT;
+ SD_INIT(sd, SIBLING);
sd->span = per_cpu(cpu_sibling_map, i);
cpus_and(sd->span, sd->span, *cpu_map);
sd->parent = p;
p->child = sd;
- cpu_to_cpu_group(i, cpu_map, &sd->groups);
+ cpu_to_cpu_group(i, cpu_map, &sd->groups, tmpmask);
#endif
}
#ifdef CONFIG_SCHED_SMT
/* Set up CPU (sibling) groups */
for_each_cpu_mask(i, *cpu_map) {
- cpumask_t this_sibling_map = per_cpu(cpu_sibling_map, i);
- cpus_and(this_sibling_map, this_sibling_map, *cpu_map);
- if (i != first_cpu(this_sibling_map))
+ SCHED_CPUMASK_VAR(this_sibling_map, allmasks);
+ SCHED_CPUMASK_VAR(send_covered, allmasks);
+
+ *this_sibling_map = per_cpu(cpu_sibling_map, i);
+ cpus_and(*this_sibling_map, *this_sibling_map, *cpu_map);
+ if (i != first_cpu(*this_sibling_map))
continue;
init_sched_build_groups(this_sibling_map, cpu_map,
- &cpu_to_cpu_group);
+ &cpu_to_cpu_group,
+ send_covered, tmpmask);
}
#endif
#ifdef CONFIG_SCHED_MC
/* Set up multi-core groups */
for_each_cpu_mask(i, *cpu_map) {
- cpumask_t this_core_map = *cpu_coregroup_map(i);
- cpus_and(this_core_map, this_core_map, *cpu_map);
- if (i != first_cpu(this_core_map))
+ SCHED_CPUMASK_VAR(this_core_map, allmasks);
+ SCHED_CPUMASK_VAR(send_covered, allmasks);
+
+ *this_core_map = *cpu_coregroup_map(i);
+ cpus_and(*this_core_map, *this_core_map, *cpu_map);
+ if (i != first_cpu(*this_core_map))
continue;
+
init_sched_build_groups(this_core_map, cpu_map,
- &cpu_to_core_group);
+ &cpu_to_core_group,
+ send_covered, tmpmask);
}
#endif
/* Set up physical groups */
for (i = 0; i < MAX_NUMNODES; i++) {
- cpumask_t nodemask = node_to_cpumask(i);
+ SCHED_CPUMASK_VAR(nodemask, allmasks);
+ SCHED_CPUMASK_VAR(send_covered, allmasks);
- cpus_and(nodemask, nodemask, *cpu_map);
- if (cpus_empty(nodemask))
+ *nodemask = node_to_cpumask(i);
+ cpus_and(*nodemask, *nodemask, *cpu_map);
+ if (cpus_empty(*nodemask))
continue;
- init_sched_build_groups(nodemask, cpu_map, &cpu_to_phys_group);
+ init_sched_build_groups(nodemask, cpu_map,
+ &cpu_to_phys_group,
+ send_covered, tmpmask);
}
#ifdef CONFIG_NUMA
/* Set up node groups */
- if (sd_allnodes)
- init_sched_build_groups(*cpu_map, cpu_map,
- &cpu_to_allnodes_group);
+ if (sd_allnodes) {
+ SCHED_CPUMASK_VAR(send_covered, allmasks);
+
+ init_sched_build_groups(cpu_map, cpu_map,
+ &cpu_to_allnodes_group,
+ send_covered, tmpmask);
+ }
for (i = 0; i < MAX_NUMNODES; i++) {
/* Set up node groups */
struct sched_group *sg, *prev;
- cpumask_t nodemask = node_to_cpumask(i);
- cpumask_t domainspan;
- cpumask_t covered = CPU_MASK_NONE;
+ SCHED_CPUMASK_VAR(nodemask, allmasks);
+ SCHED_CPUMASK_VAR(domainspan, allmasks);
+ SCHED_CPUMASK_VAR(covered, allmasks);
int j;
- cpus_and(nodemask, nodemask, *cpu_map);
- if (cpus_empty(nodemask)) {
+ *nodemask = node_to_cpumask(i);
+ cpus_clear(*covered);
+
+ cpus_and(*nodemask, *nodemask, *cpu_map);
+ if (cpus_empty(*nodemask)) {
sched_group_nodes[i] = NULL;
continue;
}
- domainspan = sched_domain_node_span(i);
- cpus_and(domainspan, domainspan, *cpu_map);
+ *domainspan = sched_domain_node_span(i);
+ cpus_and(*domainspan, *domainspan, *cpu_map);
sg = kmalloc_node(sizeof(struct sched_group), GFP_KERNEL, i);
if (!sg) {
@@ -6921,31 +7044,31 @@ static int build_sched_domains(const cpu
goto error;
}
sched_group_nodes[i] = sg;
- for_each_cpu_mask(j, nodemask) {
+ for_each_cpu_mask(j, *nodemask) {
struct sched_domain *sd;
sd = &per_cpu(node_domains, j);
sd->groups = sg;
}
sg->__cpu_power = 0;
- sg->cpumask = nodemask;
+ sg->cpumask = *nodemask;
sg->next = sg;
- cpus_or(covered, covered, nodemask);
+ cpus_or(*covered, *covered, *nodemask);
prev = sg;
for (j = 0; j < MAX_NUMNODES; j++) {
- cpumask_t tmp, notcovered;
+ SCHED_CPUMASK_VAR(notcovered, allmasks);
int n = (i + j) % MAX_NUMNODES;
node_to_cpumask_ptr(pnodemask, n);
- cpus_complement(notcovered, covered);
- cpus_and(tmp, notcovered, *cpu_map);
- cpus_and(tmp, tmp, domainspan);
- if (cpus_empty(tmp))
+ cpus_complement(*notcovered, *covered);
+ cpus_and(*tmpmask, *notcovered, *cpu_map);
+ cpus_and(*tmpmask, *tmpmask, *domainspan);
+ if (cpus_empty(*tmpmask))
break;
- cpus_and(tmp, tmp, *pnodemask);
- if (cpus_empty(tmp))
+ cpus_and(*tmpmask, *tmpmask, *pnodemask);
+ if (cpus_empty(*tmpmask))
continue;
sg = kmalloc_node(sizeof(struct sched_group),
@@ -6956,9 +7079,9 @@ static int build_sched_domains(const cpu
goto error;
}
sg->__cpu_power = 0;
- sg->cpumask = tmp;
+ sg->cpumask = *tmpmask;
sg->next = prev->next;
- cpus_or(covered, covered, tmp);
+ cpus_or(*covered, *covered, *tmpmask);
prev->next = sg;
prev = sg;
}
@@ -6994,7 +7117,8 @@ static int build_sched_domains(const cpu
if (sd_allnodes) {
struct sched_group *sg;
- cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg);
+ cpu_to_allnodes_group(first_cpu(*cpu_map), cpu_map, &sg,
+ tmpmask);
init_numa_sched_groups_power(sg);
}
#endif
@@ -7012,11 +7136,13 @@ static int build_sched_domains(const cpu
cpu_attach_domain(sd, rd, i);
}
+ SCHED_CPUMASK_FREE((void *)allmasks);
return 0;
#ifdef CONFIG_NUMA
error:
- free_sched_groups(cpu_map);
+ free_sched_groups(cpu_map, tmpmask);
+ SCHED_CPUMASK_FREE((void *)allmasks);
return -ENOMEM;
#endif
}
@@ -7056,9 +7182,10 @@ static int arch_init_sched_domains(const
return err;
}
-static void arch_destroy_sched_domains(const cpumask_t *cpu_map)
+static void arch_destroy_sched_domains(const cpumask_t *cpu_map,
+ cpumask_t *tmpmask)
{
- free_sched_groups(cpu_map);
+ free_sched_groups(cpu_map, tmpmask);
}
/*
@@ -7067,6 +7194,7 @@ static void arch_destroy_sched_domains(c
*/
static void detach_destroy_domains(const cpumask_t *cpu_map)
{
+ cpumask_t tmpmask;
int i;
unregister_sched_domain_sysctl();
@@ -7074,7 +7202,7 @@ static void detach_destroy_domains(const
for_each_cpu_mask(i, *cpu_map)
cpu_attach_domain(NULL, &def_root_domain, i);
synchronize_sched();
- arch_destroy_sched_domains(cpu_map);
+ arch_destroy_sched_domains(cpu_map, &tmpmask);
}
/*
@@ -7282,7 +7410,7 @@ void __init sched_init_smp(void)
hotcpu_notifier(update_sched_domains, 0);
/* Move init over to a non-isolated CPU */
- if (set_cpus_allowed(current, non_isolated_cpus) < 0)
+ if (set_cpus_allowed_ptr(current, &non_isolated_cpus) < 0)
BUG();
sched_init_granularity();
}
--
* [PATCH 12/12] cpumask: Cleanup more uses of CPU_MASK and NODE_MASK
2008-04-05 1:11 [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v3 Mike Travis
` (10 preceding siblings ...)
2008-04-05 1:11 ` [PATCH 11/12] cpumask: reduce stack usage in SD_x_INIT initializers Mike Travis
@ 2008-04-05 1:11 ` Mike Travis
11 siblings, 0 replies; 17+ messages in thread
From: Mike Travis @ 2008-04-05 1:11 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
[-- Attachment #1: use-CPUMASK_ALL_PTR --]
[-- Type: text/plain, Size: 2339 bytes --]
* Replace usages of CPU_MASK_NONE, CPU_MASK_ALL, NODE_MASK_NONE and
NODE_MASK_ALL with their in-place clear/setall equivalents to reduce
stack requirements for large NR_CPUS and MAX_NUMNODES counts.
* In some cases, the cpumask variable was initialized but then overwritten
with another value. This is the case for changes like this:
- cpumask_t oldmask = CPU_MASK_ALL;
+ cpumask_t oldmask;
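The replacement pattern, sketched (mask and nmask are placeholder
variable names): the in-place operators modify the existing object, so
no NR_CPUS- or MAX_NUMNODES-bit constant has to be built and copied:

        cpus_clear(mask);       /* was: mask = CPU_MASK_NONE;   */
        cpus_setall(mask);      /* was: mask = CPU_MASK_ALL;    */
        nodes_clear(nmask);     /* was: nmask = NODE_MASK_NONE; */
        nodes_setall(nmask);    /* was: nmask = NODE_MASK_ALL;  */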
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ x86/latest .../x86/linux-2.6-x86.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/genapic_flat_64.c | 4 +++-
arch/x86/kernel/io_apic_64.c | 2 +-
kernel/irq/chip.c | 2 +-
mm/allocpercpu.c | 3 ++-
4 files changed, 7 insertions(+), 4 deletions(-)
--- linux-2.6.x86.orig/arch/x86/kernel/genapic_flat_64.c
+++ linux-2.6.x86/arch/x86/kernel/genapic_flat_64.c
@@ -138,7 +138,9 @@ static cpumask_t physflat_target_cpus(vo
static cpumask_t physflat_vector_allocation_domain(int cpu)
{
- cpumask_t domain = CPU_MASK_NONE;
+ cpumask_t domain;
+
+ cpus_clear(domain);
cpu_set(cpu, domain);
return domain;
}
--- linux-2.6.x86.orig/arch/x86/kernel/io_apic_64.c
+++ linux-2.6.x86/arch/x86/kernel/io_apic_64.c
@@ -772,7 +772,7 @@ static void __clear_irq_vector(int irq)
per_cpu(vector_irq, cpu)[vector] = -1;
cfg->vector = 0;
- cfg->domain = CPU_MASK_NONE;
+ cpus_clear(cfg->domain);
}
void __setup_vector_irq(int cpu)
--- linux-2.6.x86.orig/kernel/irq/chip.c
+++ linux-2.6.x86/kernel/irq/chip.c
@@ -47,7 +47,7 @@ void dynamic_irq_init(unsigned int irq)
desc->irq_count = 0;
desc->irqs_unhandled = 0;
#ifdef CONFIG_SMP
- desc->affinity = CPU_MASK_ALL;
+ cpus_setall(desc->affinity);
#endif
spin_unlock_irqrestore(&desc->lock, flags);
}
--- linux-2.6.x86.orig/mm/allocpercpu.c
+++ linux-2.6.x86/mm/allocpercpu.c
@@ -82,9 +82,10 @@ EXPORT_SYMBOL_GPL(percpu_populate);
int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
cpumask_t *mask)
{
- cpumask_t populated = CPU_MASK_NONE;
+ cpumask_t populated;
int cpu;
+ cpus_clear(populated);
for_each_cpu_mask(cpu, *mask)
if (unlikely(!percpu_populate(__pdata, size, gfp, cpu))) {
__percpu_depopulate_mask(__pdata, &populated);
--
* Re: [PATCH 04/12] sched: Remove fixed NR_CPUS sized arrays in kernel_sched_c v2
2008-04-05 1:11 ` [PATCH 04/12] sched: Remove fixed NR_CPUS sized arrays in kernel_sched_c v2 Mike Travis
@ 2008-04-22 16:16 ` Tony Luck
2008-04-22 16:42 ` Mike Travis
2008-04-22 17:04 ` [PATCH 1/1] sched: remove unnecessary kzalloc in sched_init_smp Mike Travis
0 siblings, 2 replies; 17+ messages in thread
From: Tony Luck @ 2008-04-22 16:16 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
linux-kernel
On Fri, Apr 4, 2008 at 6:11 PM, Mike Travis <travis@sgi.com> wrote:
> @@ -7297,6 +7287,11 @@ void __init sched_init_smp(void)
> #else
> void __init sched_init_smp(void)
> {
> +#if defined(CONFIG_NUMA)
> + sched_group_nodes_bycpu = kzalloc(nr_cpu_ids * sizeof(void **),
> + GFP_KERNEL);
> + BUG_ON(sched_group_nodes_bycpu == NULL);
> +#endif
> sched_init_granularity();
> }
> #endif /* CONFIG_SMP */
This hunk is causing problems with one of my builds (generic,
uniprocessor). Note
that the #else at the start of this hunk is from a #ifdef CONFIG_SMP ... so I'm
wondering why we need #if defined(CONFIG_NUMA) inside uniprocessor code :-)
How can you have NUMA issues with only one cpu!!!
[I'm also wondering why the config that has the compile problem has CONFIG_SMP=n
and CONFIG_NUMA=y ... but that weirdness exposed this silliness, so
perhaps it isn't
all bad]
Error message is:
kernel/sched.c: In function `sched_init_smp':
kernel/sched.c:7994: error: `sched_group_nodes_bycpu' undeclared
(first use in this function)
kernel/sched.c:7994: error: (Each undeclared identifier is reported only once
kernel/sched.c:7994: error: for each function it appears in.)
-Tony
* Re: [PATCH 04/12] sched: Remove fixed NR_CPUS sized arrays in kernel_sched_c v2
2008-04-22 16:16 ` Tony Luck
@ 2008-04-22 16:42 ` Mike Travis
2008-04-22 17:04 ` [PATCH 1/1] sched: remove unnecessary kzalloc in sched_init_smp Mike Travis
1 sibling, 0 replies; 17+ messages in thread
From: Mike Travis @ 2008-04-22 16:42 UTC (permalink / raw)
To: Tony Luck
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
linux-kernel
Tony Luck wrote:
> On Fri, Apr 4, 2008 at 6:11 PM, Mike Travis <travis@sgi.com> wrote:
>> @@ -7297,6 +7287,11 @@ void __init sched_init_smp(void)
>> #else
>> void __init sched_init_smp(void)
>> {
>> +#if defined(CONFIG_NUMA)
>> + sched_group_nodes_bycpu = kzalloc(nr_cpu_ids * sizeof(void **),
>> + GFP_KERNEL);
>> + BUG_ON(sched_group_nodes_bycpu == NULL);
>> +#endif
>> sched_init_granularity();
>> }
>> #endif /* CONFIG_SMP */
>
> This hunk is causing problems with one of my builds (generic,
> uniprocessor). Note
> that the #else at the start of this hunk is from a #ifdef CONFIG_SMP ... so I'm
> wondering why we need #if defined(CONFIG_NUMA) inside uniprocessor code :-)
> How can you have NUMA issues with only one cpu!!!
>
> [I'm also wondering why the config that has the compile problem has CONFIG_SMP=n
> and CONFIG_NUMA=y ... but that weirdness exposed this silliness, so
> perhaps it isn't
> all bad]
>
> Error message is:
> kernel/sched.c: In function `sched_init_smp':
> kernel/sched.c:7994: error: `sched_group_nodes_bycpu' undeclared
> (first use in this function)
> kernel/sched.c:7994: error: (Each undeclared identifier is reported only once
> kernel/sched.c:7994: error: for each function it appears in.)
>
> -Tony
Hi Tony,
Hmm, yes, good point. I guess you might have it when there's a single
processor under a guest OS that's not on node 0, but I suspect there are
plenty of other problems with that scenario.
We could make CONFIG_NUMA dependent on CONFIG_SMP? Or define
sched_group_nodes_bycpu for the non-SMP case?
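The Kconfig route would amount to adding the dependency to the arch's
NUMA option, roughly like this (an untested sketch; the exact entry
varies by architecture):

        config NUMA
                bool "NUMA support"
                depends on SMP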
[I'll have to go back to that patch and research why I added this.]
Thanks,
Mike
* [PATCH 1/1] sched: remove unnecessary kzalloc in sched_init_smp
2008-04-22 16:16 ` Tony Luck
2008-04-22 16:42 ` Mike Travis
@ 2008-04-22 17:04 ` Mike Travis
2008-04-22 17:28 ` Luck, Tony
1 sibling, 1 reply; 17+ messages in thread
From: Mike Travis @ 2008-04-22 17:04 UTC (permalink / raw)
To: Tony Luck, Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
Tony Luck wrote:
> On Fri, Apr 4, 2008 at 6:11 PM, Mike Travis <travis@sgi.com> wrote:
>> @@ -7297,6 +7287,11 @@ void __init sched_init_smp(void)
>> #else
>> void __init sched_init_smp(void)
>> {
>> +#if defined(CONFIG_NUMA)
>> + sched_group_nodes_bycpu = kzalloc(nr_cpu_ids * sizeof(void **),
>> + GFP_KERNEL);
>> + BUG_ON(sched_group_nodes_bycpu == NULL);
>> +#endif
>> sched_init_granularity();
>> }
>> #endif /* CONFIG_SMP */
>
> This hunk is causing problems with one of my builds (generic,
> uniprocessor). Note that the #else at the start of this hunk is from a
> #ifdef CONFIG_SMP ... so I'm wondering why we need
> #if defined(CONFIG_NUMA) inside uniprocessor code :-)
> How can you have NUMA issues with only one cpu!!!
CONFIG_NUMA is dependent on CONFIG_SMP for x86, so that caused me to miss
the error. But here's the fix:
* sched_group_nodes_bycpu is defined only for the SMP + NUMA case, so it
should not be referenced in the !SMP + NUMA case.
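For reference, the layout in kernel/sched.c is roughly the following
(a simplified sketch, not the literal source):

	#ifdef CONFIG_SMP
	#ifdef CONFIG_NUMA
	/* declared only when both CONFIG_SMP and CONFIG_NUMA are set */
	static struct sched_group ***sched_group_nodes_bycpu;
	#endif
	/* ... SMP-only code, including build_sched_domains() and the
	 * SMP version of sched_init_smp() ... */
	#else	/* !CONFIG_SMP */
	/* the UP sched_init_smp() lives here; the identifier above was
	 * never declared in this branch -- hence the build error */
	#endif	/* CONFIG_SMP */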
For inclusion into sched-devel/latest tree.
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Signed-off-by: Mike Travis <travis@sgi.com>
---
kernel/sched.c | 5 -----
1 file changed, 5 deletions(-)
--- linux-2.6.sched.orig/kernel/sched.c
+++ linux-2.6.sched/kernel/sched.c
@@ -8028,11 +8028,6 @@ void __init sched_init_smp(void)
#else
void __init sched_init_smp(void)
{
-#if defined(CONFIG_NUMA)
- sched_group_nodes_bycpu = kzalloc(nr_cpu_ids * sizeof(void **),
- GFP_KERNEL);
- BUG_ON(sched_group_nodes_bycpu == NULL);
-#endif
sched_init_granularity();
}
#endif /* CONFIG_SMP */
* RE: [PATCH 1/1] sched: remove unnecessary kzalloc in sched_init_smp
2008-04-22 17:04 ` [PATCH 1/1] sched: remove unnecessary kzalloc in sched_init_smp Mike Travis
@ 2008-04-22 17:28 ` Luck, Tony
0 siblings, 0 replies; 17+ messages in thread
From: Luck, Tony @ 2008-04-22 17:28 UTC (permalink / raw)
To: Mike Travis, Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, linux-kernel
Acked-by: Tony Luck <tony.luck@intel.com>
> For inclusion into sched-devel/latest tree.
Needs to head to Linus' tree ... the preceding patches have already
been merged there and are causing build errors for me.
Thread overview: 17+ messages
2008-04-05 1:11 [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v3 Mike Travis
2008-04-05 1:11 ` [PATCH 01/12] x86: Convert cpumask_of_cpu macro to allocated array Mike Travis
2008-04-05 1:11 ` [PATCH 02/12] cpumask: add CPU_MASK_ALL_PTR macro Mike Travis
2008-04-05 1:11 ` [PATCH 03/12] cpumask: reduce stack pressure in cpu_coregroup_map Mike Travis
2008-04-05 1:11 ` [PATCH 04/12] sched: Remove fixed NR_CPUS sized arrays in kernel_sched_c v2 Mike Travis
2008-04-22 16:16 ` Tony Luck
2008-04-22 16:42 ` Mike Travis
2008-04-22 17:04 ` [PATCH 1/1] sched: remove unnecessary kzalloc in sched_init_smp Mike Travis
2008-04-22 17:28 ` Luck, Tony
2008-04-05 1:11 ` [PATCH 05/12] x86: use new set_cpus_allowed_ptr function Mike Travis
2008-04-05 1:11 ` [PATCH 06/12] generic: " Mike Travis
2008-04-05 1:11 ` [PATCH 07/12] cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer Mike Travis
2008-04-05 1:11 ` [PATCH 08/12] generic: reduce stack pressure in sched_affinity Mike Travis
2008-04-05 1:11 ` [PATCH 09/12] numa: move large array from stack to _initdata section Mike Travis
2008-04-05 1:11 ` [PATCH 10/12] nodemask: use new node_to_cpumask_ptr function Mike Travis
2008-04-05 1:11 ` [PATCH 11/12] cpumask: reduce stack usage in SD_x_INIT initializers Mike Travis
2008-04-05 1:11 ` [PATCH 12/12] cpumask: Cleanup more uses of CPU_MASK and NODE_MASK Mike Travis