* [PATCH 0/3] x86: Reduce memory and intra-node effects with large count NR_CPUs fixup
From: travis @ 2008-01-17 22:35 UTC
To: Andrew Morton, Andi Kleen, mingo
Cc: Christoph Lameter, linux-mm, linux-kernel

Fixup change NR_CPUS patchset by rebasing on 2.6.24-rc8-mm1 (from 2.6.24-rc6-mm1)
and adding last minute changes suggested by reviews.

Based on 2.6.24-rc8-mm1

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
---

-- 

* [PATCH 1/3] x86: Change size of node ids from u8 to u16 fixup
From: travis @ 2008-01-17 22:35 UTC
To: Andrew Morton, Andi Kleen, mingo
Cc: Christoph Lameter, linux-mm, linux-kernel, Eric Dumazet

[-- Attachment #1: big_nodeids-fixup --]
[-- Type: text/plain, Size: 1898 bytes --]

Change the size of node ids from 8 bits to 16 bits to
accommodate more than 256 nodes.

Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
---
Fixup:

Size of memnode.embedded_map needs to be changed to
accommodate 16-bit node ids as suggested by Eric.

V2->V3:
    - changed memnode.embedded_map from [64-16] to [64-8]
      (and size comment to 128 bytes)

V1->V2:
    - changed pxm_to_node_map to u16
    - changed memnode map entries to u16
---
 arch/x86/mm/numa_64.c       |    2 +-
 drivers/acpi/numa.c         |    2 +-
 include/asm-x86/mmzone_64.h |    6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -88,7 +88,7 @@ static int __init allocate_cachealigned_
 	unsigned long pad, pad_addr;
 
 	memnodemap = memnode.embedded_map;
-	if (memnodemapsize <= 48)
+	if (memnodemapsize <= ARRAY_SIZE(memnode.embedded_map))
 		return 0;
 
 	pad = L1_CACHE_BYTES - 1;
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -38,7 +38,7 @@ ACPI_MODULE_NAME("numa");
 static nodemask_t nodes_found_map = NODE_MASK_NONE;
 
 /* maps to convert between proximity domain and logical node ID */
-static int pxm_to_node_map[MAX_PXM_DOMAINS]
+static u16 pxm_to_node_map[MAX_PXM_DOMAINS]
 				= { [0 ... MAX_PXM_DOMAINS - 1] = NID_INVAL };
 static int node_to_pxm_map[MAX_NUMNODES]
 				= { [0 ... MAX_NUMNODES - 1] = PXM_INVAL };
--- a/include/asm-x86/mmzone_64.h
+++ b/include/asm-x86/mmzone_64.h
@@ -15,9 +15,9 @@
 struct memnode {
 	int shift;
 	unsigned int mapsize;
-	u8 *map;
-	u8 embedded_map[64-16];
-} ____cacheline_aligned; /* total size = 64 bytes */
+	u16 *map;
+	u16 embedded_map[64-8];
+} ____cacheline_aligned; /* total size = 128 bytes */
 extern struct memnode memnode;
 #define memnode_shift memnode.shift
 #define memnodemap memnode.map

-- 

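Note on the sizing arithmetic in the patch above: the following user-space sketch checks the numbers behind the "[64-16] to [64-8]" change and shows why replacing the literal 48 with ARRAY_SIZE() keeps the check in step with the array. It is only an illustration under stated assumptions. The struct names `memnode_old` and `memnode_new`, the `uint8_t`/`uint16_t` stand-ins for `u8`/`u16`, the local `ARRAY_SIZE` macro, and the helper functions are hypothetical, and the total-size assertions assume an LP64 target (4-byte int, 8-byte pointer).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel's u8/u16 and ARRAY_SIZE. */
typedef uint8_t u8;
typedef uint16_t u16;
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Layout before the fixup: 48 one-byte node ids in the embedded map. */
struct memnode_old {
	int shift;
	unsigned int mapsize;
	u8 *map;
	u8 embedded_map[64 - 16];
};

/* Layout after the fixup: 56 two-byte node ids in the embedded map. */
struct memnode_new {
	int shift;
	unsigned int mapsize;
	u16 *map;
	u16 embedded_map[64 - 8];
};

#if defined(__LP64__)
/* Assumption: the three leading members occupy 16 bytes on LP64. */
_Static_assert(sizeof(struct memnode_old) == 64, "16 + 48 * 1 bytes");
_Static_assert(sizeof(struct memnode_new) == 128, "16 + 56 * 2 bytes");
#endif

/* Number of node ids that fit in the old embedded map. */
static size_t old_embedded_entries(void)
{
	return ARRAY_SIZE(((struct memnode_old *)0)->embedded_map);
}

/* Number of node ids that fit in the new embedded map. */
static size_t new_embedded_entries(void)
{
	return ARRAY_SIZE(((struct memnode_new *)0)->embedded_map);
}

/* Bytes occupied by the new embedded map. */
static size_t new_embedded_bytes(void)
{
	return sizeof(((struct memnode_new *)0)->embedded_map);
}
```

With the literal 48 left in place, a map of 49 to 56 entries would have been allocated separately even though it fits in the embedded array; comparing against ARRAY_SIZE() avoids having to update the constant by hand.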
* [PATCH 2/3] x86: Change NR_CPUS arrays in numa_64 fixup
From: travis @ 2008-01-17 22:35 UTC
To: Andrew Morton, Andi Kleen, mingo
Cc: Christoph Lameter, linux-mm, linux-kernel

[-- Attachment #1: NR_CPUS-arrays-in-numa_64-fixup --]
[-- Type: text/plain, Size: 7574 bytes --]

Change the following static arrays sized by NR_CPUS to
per_cpu data variables:

	char cpu_to_node_map[NR_CPUS];

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
---
fixup:

Split cpu_to_node function into "early" and "late" versions
so that x86_cpu_to_node_map_{init,early_ptr} are not EXPORT'ed.
This also involves setting up the percpu maps as early as possible.

V2->V3:
    - add early_cpu_to_node function to keep cpu_to_node efficient
    - move and rename smp_set_apicids() to setup_percpu_maps()
    - call setup_percpu_maps() as early as possible

V1->V2:
    - Removed extraneous casts
    - Fix !NUMA builds with '#ifdef CONFIG_NUMA'
---
 arch/x86/kernel/setup64.c    |   37 +++++++++++++++++++++++++++++++++++--
 arch/x86/kernel/smpboot_64.c |   38 ++------------------------------------
 arch/x86/mm/numa_64.c        |    2 --
 arch/x86/mm/srat_64.c        |    5 +++--
 include/asm-x86/numa_64.h    |    7 -------
 include/asm-x86/topology.h   |   10 +++++++++-
 mm/page_alloc.c              |    2 +-
 7 files changed, 50 insertions(+), 51 deletions(-)

--- a/arch/x86/kernel/setup64.c
+++ b/arch/x86/kernel/setup64.c
@@ -84,6 +84,36 @@ static int __init nonx32_setup(char *str
 __setup("noexec32=", nonx32_setup);
 
 /*
+ * Copy data used in early init routines from the initial arrays to the
+ * per cpu data areas.  These arrays then become expendable and the *_ptrs
+ * are zeroed indicating that the static arrays are gone.
+ */
+void __init setup_percpu_maps(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		if (per_cpu_offset(cpu)) {
+			per_cpu(x86_cpu_to_apicid, cpu) =
+						x86_cpu_to_apicid_init[cpu];
+#ifdef CONFIG_NUMA
+			per_cpu(x86_cpu_to_node_map, cpu) =
+						x86_cpu_to_node_map_init[cpu];
+#endif
+		}
+		else
+			printk(KERN_NOTICE "per_cpu_offset zero for cpu %d\n",
+									cpu);
+	}
+
+	/* indicate the early static arrays are gone */
+	x86_cpu_to_apicid_early_ptr = NULL;
+#ifdef CONFIG_NUMA
+	x86_cpu_to_node_map_early_ptr = NULL;
+#endif
+}
+
+/*
  * Great future plan:
  * Declare PDA itself and support (irqstack,tss,pgd) as per cpu data.
  * Always point %gs to its beginning
@@ -104,18 +134,21 @@ void __init setup_per_cpu_areas(void)
 	for_each_cpu_mask (i, cpu_possible_map) {
 		char *ptr;
 
-		if (!NODE_DATA(cpu_to_node(i))) {
+		if (!NODE_DATA(early_cpu_to_node(i))) {
 			printk("cpu with no node %d, num_online_nodes %d\n",
 			       i, num_online_nodes());
 			ptr = alloc_bootmem_pages(size);
 		} else {
-			ptr = alloc_bootmem_pages_node(NODE_DATA(cpu_to_node(i)), size);
+			ptr = alloc_bootmem_pages_node(NODE_DATA(early_cpu_to_node(i)), size);
 		}
 		if (!ptr)
 			panic("Cannot allocate cpu data for CPU %d\n", i);
 		cpu_pda(i)->data_offset = ptr - __per_cpu_start;
 		memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start);
 	}
+
+	/* setup percpu data maps early */
+	setup_percpu_maps();
 }
 
 void pda_init(int cpu)
--- a/arch/x86/kernel/smpboot_64.c
+++ b/arch/x86/kernel/smpboot_64.c
@@ -702,7 +702,7 @@ do_rest:
 	if (boot_error) {
 		cpu_clear(cpu, cpu_callout_map); /* was set here (do_boot_cpu()) */
 		clear_bit(cpu, (unsigned long *)&cpu_initialized); /* was set by cpu_init() */
-		clear_node_cpumask(cpu); /* was set by numa_add_cpu */
+		clear_bit(cpu, (unsigned long *)&node_to_cpumask_map[cpu_to_node(cpu)]);
 		cpu_clear(cpu, cpu_present_map);
 		cpu_clear(cpu, cpu_possible_map);
 		per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
@@ -851,39 +851,6 @@ static int __init smp_sanity_check(unsig
 	return 0;
 }
 
-/*
- * Copy data used in early init routines from the initial arrays to the
- * per cpu data areas.  These arrays then become expendable and the
- * *_ptrs are zeroed indicating that the static arrays are gone.
- */
-void __init smp_set_apicids(void)
-{
-	int cpu;
-
-	for_each_possible_cpu(cpu) {
-		if (per_cpu_offset(cpu)) {
-			per_cpu(x86_cpu_to_apicid, cpu) =
-						x86_cpu_to_apicid_init[cpu];
-#ifdef CONFIG_NUMA
-			per_cpu(x86_cpu_to_node_map, cpu) =
-						x86_cpu_to_node_map_init[cpu];
-#endif
-			per_cpu(x86_bios_cpu_apicid, cpu) =
-						x86_bios_cpu_apicid_init[cpu];
-		}
-		else
-			printk(KERN_NOTICE "per_cpu_offset zero for cpu %d\n",
-									cpu);
-	}
-
-	/* indicate the early static arrays are gone */
-	x86_cpu_to_apicid_early_ptr = NULL;
-#ifdef CONFIG_NUMA
-	x86_cpu_to_node_map_early_ptr = NULL;
-#endif
-	x86_bios_cpu_apicid_early_ptr = NULL;
-}
-
 static void __init smp_cpu_index_default(void)
 {
 	int i;
@@ -906,7 +873,6 @@ void __init smp_prepare_cpus(unsigned in
 	smp_cpu_index_default();
 	current_cpu_data = boot_cpu_data;
 	current_thread_info()->cpu = 0;  /* needed? */
-	smp_set_apicids();
 	set_cpu_sibling_map(0);
 
 	if (smp_sanity_check(max_cpus) < 0) {
@@ -1060,7 +1026,7 @@ void remove_cpu_from_maps(void)
 	cpu_clear(cpu, cpu_callout_map);
 	cpu_clear(cpu, cpu_callin_map);
 	clear_bit(cpu, (unsigned long *)&cpu_initialized); /* was set by cpu_init() */
-	clear_node_cpumask(cpu);
+	clear_bit(cpu, (unsigned long *)&node_to_cpumask_map[cpu_to_node(cpu)]);
 }
 
 int __cpu_disable(void)
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -35,8 +35,6 @@ u16 x86_cpu_to_node_map_init[NR_CPUS] =
 	[0 ... NR_CPUS-1] = NUMA_NO_NODE
 };
 void *x86_cpu_to_node_map_early_ptr;
-EXPORT_SYMBOL(x86_cpu_to_node_map_init);
-EXPORT_SYMBOL(x86_cpu_to_node_map_early_ptr);
 DEFINE_PER_CPU(u16, x86_cpu_to_node_map) = NUMA_NO_NODE;
 EXPORT_PER_CPU_SYMBOL(x86_cpu_to_node_map);
 
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -382,9 +382,10 @@ int __init acpi_scan_nodes(unsigned long
 		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
 
 	for (i = 0; i < NR_CPUS; i++) {
-		if (cpu_to_node(i) == NUMA_NO_NODE)
+		int node = cpu_to_node(i);
+		if (node == NUMA_NO_NODE)
 			continue;
-		if (!node_isset(cpu_to_node(i), node_possible_map))
+		if (!node_isset(node, node_possible_map))
 			numa_set_node(i, NUMA_NO_NODE);
 	}
 	numa_init_array();
--- a/include/asm-x86/numa_64.h
+++ b/include/asm-x86/numa_64.h
@@ -29,15 +29,8 @@ extern void setup_node_bootmem(int nodei
 
 #ifdef CONFIG_NUMA
 extern void __init init_cpu_to_node(void);
-
-static inline void clear_node_cpumask(int cpu)
-{
-	clear_bit(cpu, (unsigned long *)&node_to_cpumask_map[cpu_to_node(cpu)]);
-}
-
 #else
 #define init_cpu_to_node() do {} while (0)
-#define clear_node_cpumask(cpu) do {} while (0)
 #endif
 
 #endif
--- a/include/asm-x86/topology.h
+++ b/include/asm-x86/topology.h
@@ -38,7 +38,7 @@ extern cpumask_t node_to_cpumask_map[];
 #define NUMA_NO_NODE	((u16)(~0))
 
 /* Returns the number of the node containing CPU 'cpu' */
-static inline int cpu_to_node(int cpu)
+static inline int early_cpu_to_node(int cpu)
 {
 	u16 *cpu_to_node_map = x86_cpu_to_node_map_early_ptr;
 
@@ -50,6 +50,14 @@ static inline int cpu_to_node(int cpu)
 		return NUMA_NO_NODE;
 }
 
+static inline int cpu_to_node(int cpu)
+{
+	if(per_cpu_offset(cpu))
+		return per_cpu(x86_cpu_to_node_map, cpu);
+	else
+		return NUMA_NO_NODE;
+}
+
 /*
  * Returns the number of the node containing Node 'node'.  This
  * architecture is flat, so it is a pretty simple function!
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1783,7 +1783,7 @@ EXPORT_SYMBOL(free_pages);
 static unsigned int nr_free_zone_pages(int offset)
 {
 	/* Just pick one node, since fallback list is circular */
-	pg_data_t *pgdat = NODE_DATA(numa_node_id());
+	pg_data_t *pgdat = NODE_DATA(cpu_to_node(raw_smp_processor_id()));
 	unsigned int sum = 0;
 
 	struct zonelist *zonelist = pgdat->node_zonelists + offset;

-- 

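Note on the early/late split in the patch above: the handoff from the boot-time table to the per-cpu copy can be modelled in user space roughly as follows. This is a sketch under assumptions, not the kernel code. Plain arrays stand in for the per-cpu areas, and `node_map_init`, `node_map_early_ptr`, `percpu_node` and `percpu_ready` are hypothetical names. The diff does not show the full body of early_cpu_to_node(), so the fallback order used here (early table first, then the per-cpu copy) is a guess at its intent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4
#define NUMA_NO_NODE ((uint16_t)~0)

/* Boot-time table, valid only until the per-cpu copies exist. */
static uint16_t node_map_init[NR_CPUS] = { 0, 0, 1, 1 };
static uint16_t *node_map_early_ptr = node_map_init;

/* Hypothetical stand-ins for per_cpu(x86_cpu_to_node_map, cpu) and
 * for "per_cpu_offset(cpu) is non-zero". */
static uint16_t percpu_node[NR_CPUS];
static int percpu_ready[NR_CPUS];

/* Late lookup: one test and one load, no early-pointer check. */
static int cpu_to_node(int cpu)
{
	return percpu_ready[cpu] ? percpu_node[cpu] : NUMA_NO_NODE;
}

/* Early lookup: usable before and after the handoff (assumed order). */
static int early_cpu_to_node(int cpu)
{
	if (node_map_early_ptr)
		return node_map_early_ptr[cpu];
	return cpu_to_node(cpu);
}

/* Copy the boot-time table into the per-cpu slots, then retire it. */
static void setup_percpu_maps(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (percpu_ready[cpu])
			percpu_node[cpu] = node_map_init[cpu];
	node_map_early_ptr = NULL;
}

/* Mirrors the patch: the maps are set up as soon as the areas exist. */
static void setup_per_cpu_areas(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		percpu_ready[cpu] = 1;
	setup_percpu_maps();
}
```

In this model, callers that may run before setup_per_cpu_areas() have to use early_cpu_to_node(), which is the reason the patch switches the two call sites inside setup_per_cpu_areas() itself.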
* [PATCH 3/3] x86: Change bios_cpu_apicid to percpu data variable fixup
From: travis @ 2008-01-17 22:35 UTC
To: Andrew Morton, Andi Kleen, mingo
Cc: Christoph Lameter, linux-mm, linux-kernel

[-- Attachment #1: change-bios_cpu_apicid-to-percpu-fixup --]
[-- Type: text/plain, Size: 2564 bytes --]

Change static bios_cpu_apicid array to a per_cpu data variable.
This includes using a static array used during initialization
similar to the way x86_cpu_to_apicid[] is handled.

There is one early use of bios_cpu_apicid in apic_is_clustered_box().
The other reference in cpu_present_to_apicid() is called after
setup_percpu_maps() has set up the percpu version of bios_cpu_apicid.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
---
V1->V2:
    - Removed extraneous casts
    - Add slight optimization to apic_is_clustered_box()
      [don't reference x86_bios_cpu_apicid_early_ptr each pass.]
---
 arch/x86/kernel/apic_64.c  |    6 +++---
 arch/x86/kernel/setup64.c  |    3 +++
 arch/x86/kernel/setup_64.c |    1 +
 3 files changed, 7 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/apic_64.c
+++ b/arch/x86/kernel/apic_64.c
@@ -1191,9 +1191,9 @@ __cpuinit int apic_is_clustered_box(void
 
 	/* Problem:  Partially populated chassis may not have CPUs in some of
 	 * the APIC clusters they have been allocated.  Only present CPUs have
-	 * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.  Since
-	 * clusters are allocated sequentially, count zeros only if they are
-	 * bounded by ones.
+	 * x86_bios_cpu_apicid entries, thus causing zeroes in the bitmap.
+	 * Since clusters are allocated sequentially, count zeros only if
+	 * they are bounded by ones.
 	 */
 	clusters = 0;
 	zeros = 0;
--- a/arch/x86/kernel/setup64.c
+++ b/arch/x86/kernel/setup64.c
@@ -96,6 +96,8 @@ void __init setup_percpu_maps(void)
 		if (per_cpu_offset(cpu)) {
 			per_cpu(x86_cpu_to_apicid, cpu) =
 						x86_cpu_to_apicid_init[cpu];
+			per_cpu(x86_bios_cpu_apicid, cpu) =
+						x86_bios_cpu_apicid_init[cpu];
 #ifdef CONFIG_NUMA
 			per_cpu(x86_cpu_to_node_map, cpu) =
 						x86_cpu_to_node_map_init[cpu];
@@ -108,6 +110,7 @@ void __init setup_percpu_maps(void)
 
 	/* indicate the early static arrays are gone */
 	x86_cpu_to_apicid_early_ptr = NULL;
+	x86_bios_cpu_apicid_early_ptr = NULL;
 #ifdef CONFIG_NUMA
 	x86_cpu_to_node_map_early_ptr = NULL;
 #endif
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -390,6 +390,7 @@ void __init setup_arch(char **cmdline_p)
 #ifdef CONFIG_SMP
 	/* setup to use the early static init tables during kernel startup */
 	x86_cpu_to_apicid_early_ptr = (void *)&x86_cpu_to_apicid_init;
+	x86_bios_cpu_apicid_early_ptr = (void *)&x86_bios_cpu_apicid_init;
 #ifdef CONFIG_NUMA
 	x86_cpu_to_node_map_early_ptr = (void *)&x86_cpu_to_node_map_init;
 #endif

-- 

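Note on the comment reflowed in apic_is_clustered_box(): the rule "count zeros only if they are bounded by ones" could be read as in the sketch below. This is one literal reading of the comment, written for illustration only. The function name `count_clusters`, the byte-array stand-in for the cluster bitmap, and the treatment of leading zeros (ignored here, like trailing ones) are assumptions, and the loop in the kernel may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/*
 * present[i] is non-zero when some CPU reported an APIC id in cluster i.
 * Empty clusters count toward the total only when occupied clusters
 * appear on both sides of them.
 */
static int count_clusters(const unsigned char *present, size_t n)
{
	int clusters = 0;
	int pending_zeros = 0;	/* empty clusters since the last occupied one */
	int seen_one = 0;	/* has any occupied cluster been seen yet? */

	for (size_t i = 0; i < n; i++) {
		if (present[i]) {
			/* The pending run is now bounded on both sides. */
			clusters += 1 + (seen_one ? pending_zeros : 0);
			pending_zeros = 0;
			seen_one = 1;
		} else {
			pending_zeros++;
		}
	}
	/* A trailing run of zeros is never bounded, so it is dropped. */
	return clusters;
}
```

For a partially populated chassis with the occupancy pattern 1,0,0,1,0 this counts four clusters: the two occupied ones plus the two empty ones between them, while the empty cluster at the end is ignored.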
* Re: [PATCH 0/3] x86: Reduce memory and intra-node effects with large count NR_CPUs fixup
From: Mike Travis @ 2008-01-17 22:42 UTC
To: Andrew Morton, Andi Kleen, mingo
Cc: Christoph Lameter, linux-mm, linux-kernel

Hi Andrew,

My automatic scripts accidentally sent this mail prematurely.  Please
hold off applying yet.

Thanks,
Mike

travis@sgi.com wrote:
> Fixup change NR_CPUS patchset by rebasing on 2.6.24-rc8-mm1 (from 2.6.24-rc6-mm1)
> and adding last minute changes suggested by reviews.
>
> Based on 2.6.24-rc8-mm1
>
> Signed-off-by: Mike Travis <travis@sgi.com>
> Reviewed-by: Christoph Lameter <clameter@sgi.com>
> ---
>

* Re: [PATCH 0/3] x86: Reduce memory and intra-node effects with large count NR_CPUs fixup
From: Ingo Molnar @ 2008-01-18 9:23 UTC
To: Mike Travis
Cc: Andrew Morton, Andi Kleen, Christoph Lameter, linux-mm, linux-kernel

* Mike Travis <travis@sgi.com> wrote:

> Hi Andrew,
>
> My automatic scripts accidentally sent this mail prematurely.  Please
> hold off applying yet.

I've picked it up for x86.git and i'll keep testing it (the patches seem
straightforward) and will report any problems with the bite-head-off
option unset.

[ The 32-bit NUMA compile issue is orthogonal to these patches - it's
  due to the lack of 32-bit NUMA support in your changes :) That needs
  fixing before this could go into v2.6.25. ]

	Ingo

* Re: [PATCH 0/3] x86: Reduce memory and intra-node effects with large count NR_CPUs fixup
From: Mike Travis @ 2008-01-18 12:59 UTC
To: Ingo Molnar
Cc: Andrew Morton, Andi Kleen, Christoph Lameter, linux-mm, linux-kernel

Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> Hi Andrew,
>>
>> My automatic scripts accidentally sent this mail prematurely.  Please
>> hold off applying yet.
>
> I've picked it up for x86.git and i'll keep testing it (the patches seem
> straightforward) and will report any problems with the bite-head-off
> option unset.
>
> [ The 32-bit NUMA compile issue is orthogonal to these patches - it's
>   due to the lack of 32-bit NUMA support in your changes :) That needs
>   fixing before this could go into v2.6.25. ]
>
> 	Ingo

I hadn't considered doing 32-bit NUMA changes as I didn't know if the
NR_CPUS count would really be increased for the 32-bit architecture.
I have been trying though not to break it. ;-)

Thanks,
Mike

* Re: [PATCH 0/3] x86: Reduce memory and intra-node effects with large count NR_CPUs fixup 2008-01-18 12:59 ` Mike Travis @ 2008-01-18 18:54 ` Christoph Lameter -1 siblings, 0 replies; 16+ messages in thread From: Christoph Lameter @ 2008-01-18 18:54 UTC (permalink / raw) To: Mike Travis Cc: Ingo Molnar, Andrew Morton, Andi Kleen, linux-mm, linux-kernel On Fri, 18 Jan 2008, Mike Travis wrote: > I hadn't considered doing 32-bit NUMA changes as I didn't know if the > NR_CPUS count would really be increased for the 32-bit architecture. > I have been trying though not to break it. ;-) 32-bit NUMA is tricky because ZONE_NORMAL memory is only available on node 0. There have been thorny, difficult-to-debug issues in the past... ^ permalink raw reply [flat|nested] 16+ messages in thread