linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3)
@ 2007-09-12  1:56 travis
  2007-09-12  1:56 ` [PATCH 01/10] x86: remove x86_cpu_to_log_apicid array (v3) travis
                   ` (11 more replies)
  0 siblings, 12 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter


Note:

This patch consolidates all the previous patches regarding
the conversion of static arrays sized by NR_CPUS into per_cpu
data arrays and is referenced against 2.6.23-rc6.


v1 Intro:

In x86_64 and i386 architectures most arrays that are sized
using NR_CPUS lie in local memory on node 0.  Not only will most
(99%?) of the systems not use all the slots in these arrays,
particularly when NR_CPUS is increased to accommodate future
very high cpu count systems, but a number of cache lines are
passed unnecessarily on the system bus when these arrays are
referenced by cpus on other nodes.
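
To put rough numbers on the memory side of this, here is a minimal
user-space sketch (not kernel code).  The limit of 4096 cpus and all
sketch_* names are assumptions made up for illustration; the mask type
simply models cpumask_t as one bit per possible cpu.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical build-time limit; real configurations vary. */
#define SKETCH_NR_CPUS 4096

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MASK_LONGS \
	((SKETCH_NR_CPUS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Stand-in for cpumask_t: one bit per possible cpu. */
typedef struct {
	unsigned long bits[SKETCH_MASK_LONGS];
} sketch_cpumask_t;

/* Bytes used by a static array holding one mask per possible cpu. */
static size_t sketch_static_array_bytes(void)
{
	return sizeof(sketch_cpumask_t) * SKETCH_NR_CPUS;
}

/* Bytes used when only the cpus actually present get a mask. */
static size_t sketch_per_cpu_bytes(size_t present_cpus)
{
	assert(present_cpus <= SKETCH_NR_CPUS);
	return sizeof(sketch_cpumask_t) * present_cpus;
}
```

With these assumed values each mask is 512 bytes, so one static array
costs 2 MiB, while an 8-cpu system would need only 4 KiB if the masks
were allocated per present cpu.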

Typically, the values in these arrays are referenced by the cpu
accessing its own values, though when passing IPI interrupts,
the cpu does access the data relevant to the targeted cpu/node.
Of course, if the referencing cpu is not on node 0, then the
reference will still require cross node exchanges of cache
lines.  A common use of this is for an interrupt service
routine to pass the interrupt to other cpus local to that node.

Ideally, all the elements in these arrays should be moved to the
per_cpu data area.  In some cases (such as x86_cpu_to_apicid)
the array is referenced before the per_cpu data areas are set up.
In this case, a static array is declared in the __initdata
area and initialized by the booting cpu (BSP).  The values are
then moved to the per_cpu area after it is initialized and the
original static array is freed with the rest of the __initdata.
This patch is referenced against 2.6.23-rc6.
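
The early-array-then-copy approach described above can be sketched in
user space roughly as follows.  This is only an illustration under
stated assumptions: the sketch_* names, the 8-cpu limit and the use of
calloc() are invented here, and unlike the kernel's __initdata the
early table in this sketch is never freed.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS    8     /* hypothetical compile-time maximum */
#define SKETCH_BAD_APICID 0xFF  /* marker for "no value recorded" */

/* Early table: stands in for the __initdata array filled by the BSP. */
static unsigned char sketch_early_apicid[SKETCH_NR_CPUS];

/* Stands in for one cpu's slice of the per_cpu data area. */
struct sketch_percpu_area {
	unsigned char apicid;
};

static struct sketch_percpu_area *sketch_areas[SKETCH_NR_CPUS];

/* Boot phase: per-cpu areas do not exist yet, so use the early table. */
static void sketch_early_record_apicid(int cpu, unsigned char apicid)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_early_apicid[cpu] = apicid;
}

/*
 * Allocate an area for each of the first nr_present cpus and copy the
 * early values into it.  Returns 0 on success, -1 if allocation fails.
 */
static int sketch_setup_percpu_areas(int nr_present)
{
	assert(nr_present >= 0 && nr_present <= SKETCH_NR_CPUS);
	for (int cpu = 0; cpu < nr_present; cpu++) {
		sketch_areas[cpu] = calloc(1, sizeof(*sketch_areas[cpu]));
		if (!sketch_areas[cpu])
			return -1;
		sketch_areas[cpu]->apicid = sketch_early_apicid[cpu];
	}
	return 0;
}

/* After setup, lookups go through the per-cpu area only. */
static unsigned char sketch_cpu_apicid(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_areas[cpu] ? sketch_areas[cpu]->apicid
				 : SKETCH_BAD_APICID;
}
```

A caller would record values with sketch_early_record_apicid() during
early boot, call sketch_setup_percpu_areas() once, and use
sketch_cpu_apicid() from then on.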
--

Changes for version v2:

> > Note the additional change of the cpu_llc_id type from u8
> > to int for ARCH x86_64 to correspond with ARCH i386.

> At least currently it cannot be more than 8 bit. So why
> waste memory? It would be better to change i386

Done.  (x86_64 type => u8).

> > Fix four instances where cpu_to_node is referenced
> > as an array instead of via the cpu_to_node macro.  This
> > is in preparation for moving it to the per_cpu data area.

> Shouldn't this patch be logically before the per cpu
> conversion (which is 3)? This way the result would
> be git bisectable.

Done.  (Moved to PATCH 1).

> >     processor_core.c currently tries to determine the apicid by special casing
> >     for IA64 and x86. The desired information is readily available via
> >
> > 	    cpu_physical_id()
> >
> >     on IA64, i386 and x86_64.
> 
> Have you tried this with a !CONFIG_SMP build? The drivers/dma code was doing
> the same and running into problems because it wasn't defined there.

Fixed. (New export in PATCH 1).
--

Changes for version v3:

cpu_sibling_map has been converted to a per_cpu data array to fix
build errors on ia64, ppc64 and sparc64 to accommodate references in
block/blktrace.c and kernel/sched.c when CONFIG_SCHED_SMT is defined.

Warning: ppc64 and sparc64 have not yet been built nor tested.
--

-- 


* [PATCH 01/10] x86: remove x86_cpu_to_log_apicid array (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
@ 2007-09-12  1:56 ` travis
  2007-09-12  1:56 ` [PATCH 02/10] x86: fix cpu_to_node references (v3) travis
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

This is a copy of an older patch that is in rc3-mm1.  It's needed
to allow the remaining patches to integrate correctly.

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86_64/kernel/genapic.c      |    2 --
 arch/x86_64/kernel/genapic_flat.c |    1 -
 arch/x86_64/kernel/smpboot.c      |    1 -
 include/asm-x86_64/smp.h          |    1 -
 4 files changed, 5 deletions(-)

--- a/arch/x86_64/kernel/genapic.c
+++ b/arch/x86_64/kernel/genapic.c
@@ -29,8 +29,6 @@
 					= { [0 ... NR_CPUS-1] = BAD_APICID };
 EXPORT_SYMBOL(x86_cpu_to_apicid);
 
-u8 x86_cpu_to_log_apicid[NR_CPUS]	= { [0 ... NR_CPUS-1] = BAD_APICID };
-
 struct genapic __read_mostly *genapic = &apic_flat;
 
 /*
--- a/arch/x86_64/kernel/genapic_flat.c
+++ b/arch/x86_64/kernel/genapic_flat.c
@@ -52,7 +52,6 @@
 
 	num = smp_processor_id();
 	id = 1UL << num;
-	x86_cpu_to_log_apicid[num] = id;
 	apic_write(APIC_DFR, APIC_DFR_FLAT);
 	val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
 	val |= SET_APIC_LOGICAL_ID(id);
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -702,7 +702,6 @@
 		cpu_clear(cpu, cpu_present_map);
 		cpu_clear(cpu, cpu_possible_map);
 		x86_cpu_to_apicid[cpu] = BAD_APICID;
-		x86_cpu_to_log_apicid[cpu] = BAD_APICID;
 		return -EIO;
 	}
 
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -78,7 +78,6 @@
  * the real APIC ID <-> CPU # mapping.
  */
 extern u8 x86_cpu_to_apicid[NR_CPUS];	/* physical ID */
-extern u8 x86_cpu_to_log_apicid[NR_CPUS];
 extern u8 bios_cpu_apicid[];
 
 static inline int cpu_present_to_apicid(int mps_cpu)

-- 


* [PATCH 02/10] x86: fix cpu_to_node references (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
  2007-09-12  1:56 ` [PATCH 01/10] x86: remove x86_cpu_to_log_apicid array (v3) travis
@ 2007-09-12  1:56 ` travis
  2007-09-12  1:56 ` [PATCH 03/10] x86: Convert cpu_core_map to be a per cpu variable (v3) travis
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

Fix four instances where cpu_to_node is referenced
as an array instead of via the cpu_to_node macro.  This
is in preparation for moving it to the per_cpu data area.
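
As a rough user-space illustration of why the accessor matters, the
sketch below keeps the storage layout behind a single macro.  The
sketch_* names are hypothetical and are not the kernel's definitions;
the point is only that callers which go through the macro do not care
where the value is stored.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4  /* hypothetical limit for the example */

/* New layout: the node number lives in a per-cpu structure. */
struct sketch_percpu {
	int node;
};
static struct sketch_percpu sketch_percpu_area[SKETCH_NR_CPUS];

/*
 * Callers use only this macro.  Under the old layout it could have been
 * defined as (sketch_node_array[(cpu)]); moving the data changes this
 * one line, while a caller indexing the array directly would no longer
 * compile.
 */
#define sketch_cpu_to_node(cpu) (sketch_percpu_area[(cpu)].node)

/* The macro expands to an lvalue, so it works for writes ... */
static void sketch_set_node(int cpu, int node)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_cpu_to_node(cpu) = node;
}

/* ... and for reads. */
static int sketch_get_node(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_cpu_to_node(cpu);
}
```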

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86_64/kernel/vsyscall.c |    2 +-
 arch/x86_64/mm/numa.c         |    4 ++--
 arch/x86_64/mm/srat.c         |    4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

--- a/arch/x86_64/kernel/vsyscall.c
+++ b/arch/x86_64/kernel/vsyscall.c
@@ -291,7 +291,7 @@
 	unsigned long *d;
 	unsigned long node = 0;
 #ifdef CONFIG_NUMA
-	node = cpu_to_node[cpu];
+	node = cpu_to_node(cpu);
 #endif
 	if (cpu_has(&cpu_data[cpu], X86_FEATURE_RDTSCP))
 		write_rdtscp_aux((node << 12) | cpu);
--- a/arch/x86_64/mm/numa.c
+++ b/arch/x86_64/mm/numa.c
@@ -261,7 +261,7 @@
 	   We round robin the existing nodes. */
 	rr = first_node(node_online_map);
 	for (i = 0; i < NR_CPUS; i++) {
-		if (cpu_to_node[i] != NUMA_NO_NODE)
+		if (cpu_to_node(i) != NUMA_NO_NODE)
 			continue;
  		numa_set_node(i, rr);
 		rr = next_node(rr, node_online_map);
@@ -543,7 +543,7 @@
 void __cpuinit numa_set_node(int cpu, int node)
 {
 	cpu_pda(cpu)->nodenumber = node;
-	cpu_to_node[cpu] = node;
+	cpu_to_node(cpu) = node;
 }
 
 unsigned long __init numa_free_all_bootmem(void) 
--- a/arch/x86_64/mm/srat.c
+++ b/arch/x86_64/mm/srat.c
@@ -431,9 +431,9 @@
 			setup_node_bootmem(i, nodes[i].start, nodes[i].end);
 
 	for (i = 0; i < NR_CPUS; i++) {
-		if (cpu_to_node[i] == NUMA_NO_NODE)
+		if (cpu_to_node(i) == NUMA_NO_NODE)
 			continue;
-		if (!node_isset(cpu_to_node[i], node_possible_map))
+		if (!node_isset(cpu_to_node(i), node_possible_map))
 			numa_set_node(i, NUMA_NO_NODE);
 	}
 	numa_init_array();

-- 


* [PATCH 03/10] x86: Convert cpu_core_map to be a per cpu variable (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
  2007-09-12  1:56 ` [PATCH 01/10] x86: remove x86_cpu_to_log_apicid array (v3) travis
  2007-09-12  1:56 ` [PATCH 02/10] x86: fix cpu_to_node references (v3) travis
@ 2007-09-12  1:56 ` travis
  2007-09-12  1:56 ` [PATCH 04/10] x86: Convert cpu_sibling_map " travis
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

This is from an earlier message from 'Christoph Lameter':

    cpu_core_map is currently an array defined using NR_CPUS. This means that
    we overallocate, since we will rarely use the maximum configured number
    of cpus.

    If we put the cpu_core_map into the per cpu area then it will be allocated
    for each processor as it comes online.

    This means that the core map cannot be accessed until the per cpu area
    has been allocated. Xen does a weird thing here looping over all processors
    and zeroing the masks that are not yet allocated and that will be zeroed
    when they are allocated. I commented the code out.

    Signed-off-by: Christoph Lameter <clameter@sgi.com>

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c |    2 -
 arch/i386/kernel/cpu/cpufreq/powernow-k8.c  |   10 ++++----
 arch/i386/kernel/cpu/proc.c                 |    3 +-
 arch/i386/kernel/smpboot.c                  |   34 ++++++++++++++--------------
 arch/i386/xen/smp.c                         |   14 +++++++++--
 arch/x86_64/kernel/mce_amd.c                |    6 ++--
 arch/x86_64/kernel/setup.c                  |    3 +-
 arch/x86_64/kernel/smpboot.c                |   24 +++++++++----------
 include/asm-i386/smp.h                      |    2 -
 include/asm-i386/topology.h                 |    2 -
 include/asm-x86_64/smp.h                    |    8 +++++-
 include/asm-x86_64/topology.h               |    2 -
 12 files changed, 64 insertions(+), 46 deletions(-)

--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -39,7 +39,13 @@
 extern void smp_send_reschedule(int cpu);
 
 extern cpumask_t cpu_sibling_map[NR_CPUS];
-extern cpumask_t cpu_core_map[NR_CPUS];
+/*
+ * cpu_core_map lives in a per cpu area
+ *
+ * extern cpumask_t cpu_core_map[NR_CPUS];
+ */
+DECLARE_PER_CPU(cpumask_t, cpu_core_map);
+
 extern u8 cpu_llc_id[NR_CPUS];
 
 #define SMP_TRAMPOLINE_BASE 0x6000
--- a/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ b/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -595,7 +595,7 @@
 	dmi_check_system(sw_any_bug_dmi_table);
 	if (bios_with_sw_any_bug && cpus_weight(policy->cpus) == 1) {
 		policy->shared_type = CPUFREQ_SHARED_TYPE_ALL;
-		policy->cpus = cpu_core_map[cpu];
+		policy->cpus = per_cpu(cpu_core_map, cpu);
 	}
 #endif
 
--- a/arch/i386/kernel/cpu/cpufreq/powernow-k8.c
+++ b/arch/i386/kernel/cpu/cpufreq/powernow-k8.c
@@ -57,7 +57,7 @@
 static int cpu_family = CPU_OPTERON;
 
 #ifndef CONFIG_SMP
-static cpumask_t cpu_core_map[1];
+DEFINE_PER_CPU(cpumask_t, cpu_core_map);
 #endif
 
 /* Return a frequency in MHz, given an input fid */
@@ -664,7 +664,7 @@
 
 	dprintk("cfid 0x%x, cvid 0x%x\n", data->currfid, data->currvid);
 	data->powernow_table = powernow_table;
-	if (first_cpu(cpu_core_map[data->cpu]) == data->cpu)
+	if (first_cpu(per_cpu(cpu_core_map, data->cpu)) == data->cpu)
 		print_basics(data);
 
 	for (j = 0; j < data->numps; j++)
@@ -818,7 +818,7 @@
 
 	/* fill in data */
 	data->numps = data->acpi_data.state_count;
-	if (first_cpu(cpu_core_map[data->cpu]) == data->cpu)
+	if (first_cpu(per_cpu(cpu_core_map, data->cpu)) == data->cpu)
 		print_basics(data);
 	powernow_k8_acpi_pst_values(data, 0);
 
@@ -1212,7 +1212,7 @@
 	if (cpu_family == CPU_HW_PSTATE)
 		pol->cpus = cpumask_of_cpu(pol->cpu);
 	else
-		pol->cpus = cpu_core_map[pol->cpu];
+		pol->cpus = per_cpu(cpu_core_map, pol->cpu);
 	data->available_cores = &(pol->cpus);
 
 	/* Take a crude guess here.
@@ -1279,7 +1279,7 @@
 	cpumask_t oldmask = current->cpus_allowed;
 	unsigned int khz = 0;
 
-	data = powernow_data[first_cpu(cpu_core_map[cpu])];
+	data = powernow_data[first_cpu(per_cpu(cpu_core_map, cpu))];
 
 	if (!data)
 		return -EINVAL;
--- a/arch/i386/kernel/cpu/proc.c
+++ b/arch/i386/kernel/cpu/proc.c
@@ -122,7 +122,8 @@
 #ifdef CONFIG_X86_HT
 	if (c->x86_max_cores * smp_num_siblings > 1) {
 		seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
-		seq_printf(m, "siblings\t: %d\n", cpus_weight(cpu_core_map[n]));
+		seq_printf(m, "siblings\t: %d\n",
+				cpus_weight(per_cpu(cpu_core_map, n)));
 		seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
 		seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
 	}
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -74,8 +74,8 @@
 EXPORT_SYMBOL(cpu_sibling_map);
 
 /* representing HT and core siblings of each logical CPU */
-cpumask_t cpu_core_map[NR_CPUS] __read_mostly;
-EXPORT_SYMBOL(cpu_core_map);
+DEFINE_PER_CPU(cpumask_t, cpu_core_map);
+EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 
 /* bitmap of online cpus */
 cpumask_t cpu_online_map __read_mostly;
@@ -300,7 +300,7 @@
 	 * And for power savings, we return cpu_core_map
 	 */
 	if (sched_mc_power_savings || sched_smt_power_savings)
-		return cpu_core_map[cpu];
+		return per_cpu(cpu_core_map, cpu);
 	else
 		return c->llc_shared_map;
 }
@@ -321,8 +321,8 @@
 			    c[cpu].cpu_core_id == c[i].cpu_core_id) {
 				cpu_set(i, cpu_sibling_map[cpu]);
 				cpu_set(cpu, cpu_sibling_map[i]);
-				cpu_set(i, cpu_core_map[cpu]);
-				cpu_set(cpu, cpu_core_map[i]);
+				cpu_set(i, per_cpu(cpu_core_map, cpu));
+				cpu_set(cpu, per_cpu(cpu_core_map, i));
 				cpu_set(i, c[cpu].llc_shared_map);
 				cpu_set(cpu, c[i].llc_shared_map);
 			}
@@ -334,7 +334,7 @@
 	cpu_set(cpu, c[cpu].llc_shared_map);
 
 	if (current_cpu_data.x86_max_cores == 1) {
-		cpu_core_map[cpu] = cpu_sibling_map[cpu];
+		per_cpu(cpu_core_map, cpu) = cpu_sibling_map[cpu];
 		c[cpu].booted_cores = 1;
 		return;
 	}
@@ -346,8 +346,8 @@
 			cpu_set(cpu, c[i].llc_shared_map);
 		}
 		if (c[cpu].phys_proc_id == c[i].phys_proc_id) {
-			cpu_set(i, cpu_core_map[cpu]);
-			cpu_set(cpu, cpu_core_map[i]);
+			cpu_set(i, per_cpu(cpu_core_map, cpu));
+			cpu_set(cpu, per_cpu(cpu_core_map, i));
 			/*
 			 *  Does this new cpu bringup a new core?
 			 */
@@ -984,7 +984,7 @@
 					   " Using dummy APIC emulation.\n");
 		map_cpu_to_logical_apicid();
 		cpu_set(0, cpu_sibling_map[0]);
-		cpu_set(0, cpu_core_map[0]);
+		cpu_set(0, per_cpu(cpu_core_map, 0));
 		return;
 	}
 
@@ -1009,7 +1009,7 @@
 		smpboot_clear_io_apic_irqs();
 		phys_cpu_present_map = physid_mask_of_physid(0);
 		cpu_set(0, cpu_sibling_map[0]);
-		cpu_set(0, cpu_core_map[0]);
+		cpu_set(0, per_cpu(cpu_core_map, 0));
 		return;
 	}
 
@@ -1024,7 +1024,7 @@
 		smpboot_clear_io_apic_irqs();
 		phys_cpu_present_map = physid_mask_of_physid(0);
 		cpu_set(0, cpu_sibling_map[0]);
-		cpu_set(0, cpu_core_map[0]);
+		cpu_set(0, per_cpu(cpu_core_map, 0));
 		return;
 	}
 
@@ -1107,11 +1107,11 @@
 	 */
 	for (cpu = 0; cpu < NR_CPUS; cpu++) {
 		cpus_clear(cpu_sibling_map[cpu]);
-		cpus_clear(cpu_core_map[cpu]);
+		cpus_clear(per_cpu(cpu_core_map, cpu));
 	}
 
 	cpu_set(0, cpu_sibling_map[0]);
-	cpu_set(0, cpu_core_map[0]);
+	cpu_set(0, per_cpu(cpu_core_map, 0));
 
 	smpboot_setup_io_apic();
 
@@ -1148,9 +1148,9 @@
 	int sibling;
 	struct cpuinfo_x86 *c = cpu_data;
 
-	for_each_cpu_mask(sibling, cpu_core_map[cpu]) {
-		cpu_clear(cpu, cpu_core_map[sibling]);
-		/*
+	for_each_cpu_mask(sibling, per_cpu(cpu_core_map, cpu)) {
+		cpu_clear(cpu, per_cpu(cpu_core_map, sibling));
+		/*/
 		 * last thread sibling in this cpu core going down
 		 */
 		if (cpus_weight(cpu_sibling_map[cpu]) == 1)
@@ -1160,7 +1160,7 @@
 	for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
 		cpu_clear(cpu, cpu_sibling_map[sibling]);
 	cpus_clear(cpu_sibling_map[cpu]);
-	cpus_clear(cpu_core_map[cpu]);
+	cpus_clear(per_cpu(cpu_core_map, cpu));
 	c[cpu].phys_proc_id = 0;
 	c[cpu].cpu_core_id = 0;
 	cpu_clear(cpu, cpu_sibling_setup_map);
--- a/arch/i386/xen/smp.c
+++ b/arch/i386/xen/smp.c
@@ -148,7 +148,12 @@
 
 	for (cpu = 0; cpu < NR_CPUS; cpu++) {
 		cpus_clear(cpu_sibling_map[cpu]);
-		cpus_clear(cpu_core_map[cpu]);
+		/*
+		 * cpu_core_map lives in a per cpu area that is cleared
+		 * when the per cpu array is allocated.
+		 *
+		 * cpus_clear(per_cpu(cpu_core_map, cpu));
+		 */
 	}
 
 	xen_setup_vcpu_info_placement();
@@ -160,7 +165,12 @@
 
 	for (cpu = 0; cpu < NR_CPUS; cpu++) {
 		cpus_clear(cpu_sibling_map[cpu]);
-		cpus_clear(cpu_core_map[cpu]);
+		/*
+		 * cpu_core_ map will be zeroed when the per
+		 * cpu area is allocated.
+		 *
+		 * cpus_clear(per_cpu(cpu_core_map, cpu));
+		 */
 	}
 
 	smp_store_cpu_info(0);
--- a/arch/x86_64/kernel/mce_amd.c
+++ b/arch/x86_64/kernel/mce_amd.c
@@ -472,7 +472,7 @@
 
 #ifdef CONFIG_SMP
 	if (cpu_data[cpu].cpu_core_id && shared_bank[bank]) {	/* symlink */
-		i = first_cpu(cpu_core_map[cpu]);
+		i = first_cpu(per_cpu(cpu_core_map, cpu));
 
 		/* first core not up yet */
 		if (cpu_data[i].cpu_core_id)
@@ -492,7 +492,7 @@
 		if (err)
 			goto out;
 
-		b->cpus = cpu_core_map[cpu];
+		b->cpus = per_cpu(cpu_core_map, cpu);
 		per_cpu(threshold_banks, cpu)[bank] = b;
 		goto out;
 	}
@@ -509,7 +509,7 @@
 #ifndef CONFIG_SMP
 	b->cpus = CPU_MASK_ALL;
 #else
-	b->cpus = cpu_core_map[cpu];
+	b->cpus = per_cpu(cpu_core_map, cpu);
 #endif
 	err = kobject_register(&b->kobj);
 	if (err)
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -1041,7 +1041,8 @@
 	if (smp_num_siblings * c->x86_max_cores > 1) {
 		int cpu = c - cpu_data;
 		seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
-		seq_printf(m, "siblings\t: %d\n", cpus_weight(cpu_core_map[cpu]));
+		seq_printf(m, "siblings\t: %d\n",
+			       cpus_weight(per_cpu(cpu_core_map, cpu)));
 		seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
 		seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
 	}
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -95,8 +95,8 @@
 EXPORT_SYMBOL(cpu_sibling_map);
 
 /* representing HT and core siblings of each logical CPU */
-cpumask_t cpu_core_map[NR_CPUS] __read_mostly;
-EXPORT_SYMBOL(cpu_core_map);
+DEFINE_PER_CPU(cpumask_t, cpu_core_map);
+EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 
 /*
  * Trampoline 80x86 program as an array.
@@ -245,7 +245,7 @@
 	 * And for power savings, we return cpu_core_map
 	 */
 	if (sched_mc_power_savings || sched_smt_power_savings)
-		return cpu_core_map[cpu];
+		return per_cpu(cpu_core_map, cpu);
 	else
 		return c->llc_shared_map;
 }
@@ -266,8 +266,8 @@
 			    c[cpu].cpu_core_id == c[i].cpu_core_id) {
 				cpu_set(i, cpu_sibling_map[cpu]);
 				cpu_set(cpu, cpu_sibling_map[i]);
-				cpu_set(i, cpu_core_map[cpu]);
-				cpu_set(cpu, cpu_core_map[i]);
+				cpu_set(i, per_cpu(cpu_core_map, cpu));
+				cpu_set(cpu, per_cpu(cpu_core_map, i));
 				cpu_set(i, c[cpu].llc_shared_map);
 				cpu_set(cpu, c[i].llc_shared_map);
 			}
@@ -279,7 +279,7 @@
 	cpu_set(cpu, c[cpu].llc_shared_map);
 
 	if (current_cpu_data.x86_max_cores == 1) {
-		cpu_core_map[cpu] = cpu_sibling_map[cpu];
+		per_cpu(cpu_core_map, cpu) = cpu_sibling_map[cpu];
 		c[cpu].booted_cores = 1;
 		return;
 	}
@@ -291,8 +291,8 @@
 			cpu_set(cpu, c[i].llc_shared_map);
 		}
 		if (c[cpu].phys_proc_id == c[i].phys_proc_id) {
-			cpu_set(i, cpu_core_map[cpu]);
-			cpu_set(cpu, cpu_core_map[i]);
+			cpu_set(i, per_cpu(cpu_core_map, cpu));
+			cpu_set(cpu, per_cpu(cpu_core_map, i));
 			/*
 			 *  Does this new cpu bringup a new core?
 			 */
@@ -742,7 +742,7 @@
 	else
 		phys_cpu_present_map = physid_mask_of_physid(0);
 	cpu_set(0, cpu_sibling_map[0]);
-	cpu_set(0, cpu_core_map[0]);
+	cpu_set(0, per_cpu(cpu_core_map, 0));
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -977,8 +977,8 @@
 	int sibling;
 	struct cpuinfo_x86 *c = cpu_data;
 
-	for_each_cpu_mask(sibling, cpu_core_map[cpu]) {
-		cpu_clear(cpu, cpu_core_map[sibling]);
+	for_each_cpu_mask(sibling, per_cpu(cpu_core_map, cpu)) {
+		cpu_clear(cpu, per_cpu(cpu_core_map, sibling));
 		/*
 		 * last thread sibling in this cpu core going down
 		 */
@@ -989,7 +989,7 @@
 	for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
 		cpu_clear(cpu, cpu_sibling_map[sibling]);
 	cpus_clear(cpu_sibling_map[cpu]);
-	cpus_clear(cpu_core_map[cpu]);
+	cpus_clear(per_cpu(cpu_core_map, cpu));
 	c[cpu].phys_proc_id = 0;
 	c[cpu].cpu_core_id = 0;
 	cpu_clear(cpu, cpu_sibling_setup_map);
--- a/include/asm-i386/smp.h
+++ b/include/asm-i386/smp.h
@@ -31,7 +31,7 @@
 extern int pic_mode;
 extern int smp_num_siblings;
 extern cpumask_t cpu_sibling_map[];
-extern cpumask_t cpu_core_map[];
+DECLARE_PER_CPU(cpumask_t, cpu_core_map);
 
 extern void (*mtrr_hook) (void);
 extern void zap_low_mappings (void);
--- a/include/asm-i386/topology.h
+++ b/include/asm-i386/topology.h
@@ -30,7 +30,7 @@
 #ifdef CONFIG_X86_HT
 #define topology_physical_package_id(cpu)	(cpu_data[cpu].phys_proc_id)
 #define topology_core_id(cpu)			(cpu_data[cpu].cpu_core_id)
-#define topology_core_siblings(cpu)		(cpu_core_map[cpu])
+#define topology_core_siblings(cpu)		(per_cpu(cpu_core_map, cpu))
 #define topology_thread_siblings(cpu)		(cpu_sibling_map[cpu])
 #endif
 
--- a/include/asm-x86_64/topology.h
+++ b/include/asm-x86_64/topology.h
@@ -58,7 +58,7 @@
 #ifdef CONFIG_SMP
 #define topology_physical_package_id(cpu)	(cpu_data[cpu].phys_proc_id)
 #define topology_core_id(cpu)			(cpu_data[cpu].cpu_core_id)
-#define topology_core_siblings(cpu)		(cpu_core_map[cpu])
+#define topology_core_siblings(cpu)		(per_cpu(cpu_core_map, cpu))
 #define topology_thread_siblings(cpu)		(cpu_sibling_map[cpu])
 #define mc_capable()			(boot_cpu_data.x86_max_cores > 1)
 #define smt_capable() 			(smp_num_siblings > 1)

-- 


* [PATCH 04/10] x86: Convert cpu_sibling_map to be a per cpu variable (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (2 preceding siblings ...)
  2007-09-12  1:56 ` [PATCH 03/10] x86: Convert cpu_core_map to be a per cpu variable (v3) travis
@ 2007-09-12  1:56 ` travis
  2007-09-12  1:56 ` [PATCH 05/10] x86: Convert x86_cpu_to_apicid " travis
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

Convert cpu_sibling_map from a static array sized by NR_CPUS to a
per_cpu variable.  This saves sizeof(cpumask_t) bytes for each unused
cpu slot.  Access is mostly from startup and CPU HOTPLUG functions.
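
For readers unfamiliar with the access pattern, here is a small
user-space sketch of sibling masks kept per cpu and updated
symmetrically at bring-up and removal.  It is only a model: the
sketch_* names, the 8-cpu limit and the single-word mask are
assumptions, and the accessor macro merely imitates the shape of
per_cpu(cpu_sibling_map, cpu).

```c
#include <assert.h>

#define SKETCH_NR_CPUS 8  /* hypothetical limit; fits in one word */

typedef unsigned int sketch_mask_t;  /* one bit per cpu */

/* Each cpu owns its own sibling mask instead of a slot in a shared array. */
struct sketch_percpu {
	sketch_mask_t sibling_map;
};
static struct sketch_percpu sketch_areas[SKETCH_NR_CPUS];

/* Imitates the shape of per_cpu(cpu_sibling_map, cpu). */
#define sketch_sibling_map(cpu) (sketch_areas[(cpu)].sibling_map)

/* Record that cpus a and b are hardware threads of the same core. */
static void sketch_link_siblings(int a, int b)
{
	assert(a >= 0 && a < SKETCH_NR_CPUS);
	assert(b >= 0 && b < SKETCH_NR_CPUS);
	sketch_sibling_map(a) |= 1u << b;
	sketch_sibling_map(b) |= 1u << a;
}

/* cpu goes offline: drop it from each sibling's mask, then clear its own. */
static void sketch_remove_siblinginfo(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	for (int s = 0; s < SKETCH_NR_CPUS; s++)
		if (sketch_sibling_map(cpu) & (1u << s))
			sketch_sibling_map(s) &= ~(1u << cpu);
	sketch_sibling_map(cpu) = 0;
}
```

The removal loop has the same shape as the hotplug path in the diff
below: walk the departing cpu's mask, clear its bit in every sibling,
then clear its own mask.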

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/i386/kernel/cpu/cpufreq/p4-clockmod.c   |    2 -
 arch/i386/kernel/cpu/cpufreq/speedstep-ich.c |    2 -
 arch/i386/kernel/io_apic.c                   |    4 +--
 arch/i386/kernel/smpboot.c                   |   36 +++++++++++++--------------
 arch/i386/oprofile/op_model_p4.c             |    2 -
 arch/i386/xen/smp.c                          |    4 +--
 arch/x86_64/kernel/smpboot.c                 |   26 +++++++++----------
 block/blktrace.c                             |    2 -
 include/asm-i386/smp.h                       |    2 -
 include/asm-i386/topology.h                  |    2 -
 include/asm-x86_64/smp.h                     |    6 +++-
 include/asm-x86_64/topology.h                |    2 -
 kernel/sched.c                               |    8 +++---
 13 files changed, 50 insertions(+), 48 deletions(-)

--- a/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
+++ b/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
@@ -200,7 +200,7 @@
 	unsigned int i;
 
 #ifdef CONFIG_SMP
-	policy->cpus = cpu_sibling_map[policy->cpu];
+	policy->cpus = per_cpu(cpu_sibling_map, policy->cpu);
 #endif
 
 	/* Errata workaround */
--- a/arch/i386/kernel/cpu/cpufreq/speedstep-ich.c
+++ b/arch/i386/kernel/cpu/cpufreq/speedstep-ich.c
@@ -322,7 +322,7 @@
 
 	/* only run on CPU to be set, or on its sibling */
 #ifdef CONFIG_SMP
-	policy->cpus = cpu_sibling_map[policy->cpu];
+	policy->cpus = per_cpu(cpu_sibling_map, policy->cpu);
 #endif
 
 	cpus_allowed = current->cpus_allowed;
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -378,7 +378,7 @@
 
 #define IRQ_ALLOWED(cpu, allowed_mask)	cpu_isset(cpu, allowed_mask)
 
-#define CPU_TO_PACKAGEINDEX(i) (first_cpu(cpu_sibling_map[i]))
+#define CPU_TO_PACKAGEINDEX(i) (first_cpu(per_cpu(cpu_sibling_map, i)))
 
 static cpumask_t balance_irq_affinity[NR_IRQS] = {
 	[0 ... NR_IRQS-1] = CPU_MASK_ALL
@@ -598,7 +598,7 @@
 	 * (A+B)/2 vs B
 	 */
 	load = CPU_IRQ(min_loaded) >> 1;
-	for_each_cpu_mask(j, cpu_sibling_map[min_loaded]) {
+	for_each_cpu_mask(j, per_cpu(cpu_sibling_map, min_loaded)) {
 		if (load > CPU_IRQ(j)) {
 			/* This won't change cpu_sibling_map[min_loaded] */
 			load = CPU_IRQ(j);
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -70,8 +70,8 @@
 int cpu_llc_id[NR_CPUS] __cpuinitdata = {[0 ... NR_CPUS-1] = BAD_APICID};
 
 /* representing HT siblings of each logical CPU */
-cpumask_t cpu_sibling_map[NR_CPUS] __read_mostly;
-EXPORT_SYMBOL(cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_t, cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 
 /* representing HT and core siblings of each logical CPU */
 DEFINE_PER_CPU(cpumask_t, cpu_core_map);
@@ -319,8 +319,8 @@
 		for_each_cpu_mask(i, cpu_sibling_setup_map) {
 			if (c[cpu].phys_proc_id == c[i].phys_proc_id &&
 			    c[cpu].cpu_core_id == c[i].cpu_core_id) {
-				cpu_set(i, cpu_sibling_map[cpu]);
-				cpu_set(cpu, cpu_sibling_map[i]);
+				cpu_set(i, per_cpu(cpu_sibling_map, cpu));
+				cpu_set(cpu, per_cpu(cpu_sibling_map, i));
 				cpu_set(i, per_cpu(cpu_core_map, cpu));
 				cpu_set(cpu, per_cpu(cpu_core_map, i));
 				cpu_set(i, c[cpu].llc_shared_map);
@@ -328,13 +328,13 @@
 			}
 		}
 	} else {
-		cpu_set(cpu, cpu_sibling_map[cpu]);
+		cpu_set(cpu, per_cpu(cpu_sibling_map, cpu));
 	}
 
 	cpu_set(cpu, c[cpu].llc_shared_map);
 
 	if (current_cpu_data.x86_max_cores == 1) {
-		per_cpu(cpu_core_map, cpu) = cpu_sibling_map[cpu];
+		per_cpu(cpu_core_map, cpu) = per_cpu(cpu_sibling_map, cpu);
 		c[cpu].booted_cores = 1;
 		return;
 	}
@@ -351,12 +351,12 @@
 			/*
 			 *  Does this new cpu bringup a new core?
 			 */
-			if (cpus_weight(cpu_sibling_map[cpu]) == 1) {
+			if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1) {
 				/*
 				 * for each core in package, increment
 				 * the booted_cores for this new cpu
 				 */
-				if (first_cpu(cpu_sibling_map[i]) == i)
+				if (first_cpu(per_cpu(cpu_sibling_map, i)) == i)
 					c[cpu].booted_cores++;
 				/*
 				 * increment the core count for all
@@ -983,7 +983,7 @@
 			printk(KERN_NOTICE "Local APIC not detected."
 					   " Using dummy APIC emulation.\n");
 		map_cpu_to_logical_apicid();
-		cpu_set(0, cpu_sibling_map[0]);
+		cpu_set(0, per_cpu(cpu_sibling_map, 0));
 		cpu_set(0, per_cpu(cpu_core_map, 0));
 		return;
 	}
@@ -1008,7 +1008,7 @@
 		printk(KERN_ERR "... forcing use of dummy APIC emulation. (tell your hw vendor)\n");
 		smpboot_clear_io_apic_irqs();
 		phys_cpu_present_map = physid_mask_of_physid(0);
-		cpu_set(0, cpu_sibling_map[0]);
+		cpu_set(0, per_cpu(cpu_sibling_map, 0));
 		cpu_set(0, per_cpu(cpu_core_map, 0));
 		return;
 	}
@@ -1023,7 +1023,7 @@
 		printk(KERN_INFO "SMP mode deactivated, forcing use of dummy APIC emulation.\n");
 		smpboot_clear_io_apic_irqs();
 		phys_cpu_present_map = physid_mask_of_physid(0);
-		cpu_set(0, cpu_sibling_map[0]);
+		cpu_set(0, per_cpu(cpu_sibling_map, 0));
 		cpu_set(0, per_cpu(cpu_core_map, 0));
 		return;
 	}
@@ -1102,15 +1102,15 @@
 	Dprintk("Boot done.\n");
 
 	/*
-	 * construct cpu_sibling_map[], so that we can tell sibling CPUs
+	 * construct cpu_sibling_map, so that we can tell sibling CPUs
 	 * efficiently.
 	 */
 	for (cpu = 0; cpu < NR_CPUS; cpu++) {
-		cpus_clear(cpu_sibling_map[cpu]);
+		cpus_clear(per_cpu(cpu_sibling_map, cpu));
 		cpus_clear(per_cpu(cpu_core_map, cpu));
 	}
 
-	cpu_set(0, cpu_sibling_map[0]);
+	cpu_set(0, per_cpu(cpu_sibling_map, 0));
 	cpu_set(0, per_cpu(cpu_core_map, 0));
 
 	smpboot_setup_io_apic();
@@ -1153,13 +1153,13 @@
 		/*/
 		 * last thread sibling in this cpu core going down
 		 */
-		if (cpus_weight(cpu_sibling_map[cpu]) == 1)
+		if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1)
 			c[sibling].booted_cores--;
 	}
 			
-	for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
-		cpu_clear(cpu, cpu_sibling_map[sibling]);
-	cpus_clear(cpu_sibling_map[cpu]);
+	for_each_cpu_mask(sibling, per_cpu(cpu_sibling_map, cpu))
+		cpu_clear(cpu, per_cpu(cpu_sibling_map, sibling));
+	cpus_clear(per_cpu(cpu_sibling_map, cpu));
 	cpus_clear(per_cpu(cpu_core_map, cpu));
 	c[cpu].phys_proc_id = 0;
 	c[cpu].cpu_core_id = 0;
--- a/arch/i386/oprofile/op_model_p4.c
+++ b/arch/i386/oprofile/op_model_p4.c
@@ -379,7 +379,7 @@
 {
 #ifdef CONFIG_SMP
 	int cpu = smp_processor_id();
-	return (cpu != first_cpu(cpu_sibling_map[cpu]));
+	return (cpu != first_cpu(per_cpu(cpu_sibling_map, cpu)));
 #endif	
 	return 0;
 }
--- a/arch/i386/xen/smp.c
+++ b/arch/i386/xen/smp.c
@@ -147,7 +147,7 @@
 	make_lowmem_page_readwrite(&per_cpu__gdt_page);
 
 	for (cpu = 0; cpu < NR_CPUS; cpu++) {
-		cpus_clear(cpu_sibling_map[cpu]);
+		cpus_clear(per_cpu(cpu_sibling_map, cpu));
 		/*
 		 * cpu_core_map lives in a per cpu area that is cleared
 		 * when the per cpu array is allocated.
@@ -164,7 +164,7 @@
 	unsigned cpu;
 
 	for (cpu = 0; cpu < NR_CPUS; cpu++) {
-		cpus_clear(cpu_sibling_map[cpu]);
+		cpus_clear(per_cpu(cpu_sibling_map, cpu));
 		/*
 		 * cpu_core_ map will be zeroed when the per
 		 * cpu area is allocated.
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -91,8 +91,8 @@
 int smp_threads_ready;
 
 /* representing HT siblings of each logical CPU */
-cpumask_t cpu_sibling_map[NR_CPUS] __read_mostly;
-EXPORT_SYMBOL(cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_t, cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 
 /* representing HT and core siblings of each logical CPU */
 DEFINE_PER_CPU(cpumask_t, cpu_core_map);
@@ -264,8 +264,8 @@
 		for_each_cpu_mask(i, cpu_sibling_setup_map) {
 			if (c[cpu].phys_proc_id == c[i].phys_proc_id &&
 			    c[cpu].cpu_core_id == c[i].cpu_core_id) {
-				cpu_set(i, cpu_sibling_map[cpu]);
-				cpu_set(cpu, cpu_sibling_map[i]);
+				cpu_set(i, per_cpu(cpu_sibling_map, cpu));
+				cpu_set(cpu, per_cpu(cpu_sibling_map, i));
 				cpu_set(i, per_cpu(cpu_core_map, cpu));
 				cpu_set(cpu, per_cpu(cpu_core_map, i));
 				cpu_set(i, c[cpu].llc_shared_map);
@@ -273,13 +273,13 @@
 			}
 		}
 	} else {
-		cpu_set(cpu, cpu_sibling_map[cpu]);
+		cpu_set(cpu, per_cpu(cpu_sibling_map, cpu));
 	}
 
 	cpu_set(cpu, c[cpu].llc_shared_map);
 
 	if (current_cpu_data.x86_max_cores == 1) {
-		per_cpu(cpu_core_map, cpu) = cpu_sibling_map[cpu];
+		per_cpu(cpu_core_map, cpu) = per_cpu(cpu_sibling_map, cpu);
 		c[cpu].booted_cores = 1;
 		return;
 	}
@@ -296,12 +296,12 @@
 			/*
 			 *  Does this new cpu bringup a new core?
 			 */
-			if (cpus_weight(cpu_sibling_map[cpu]) == 1) {
+			if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1) {
 				/*
 				 * for each core in package, increment
 				 * the booted_cores for this new cpu
 				 */
-				if (first_cpu(cpu_sibling_map[i]) == i)
+				if (first_cpu(per_cpu(cpu_sibling_map, i)) == i)
 					c[cpu].booted_cores++;
 				/*
 				 * increment the core count for all
@@ -741,7 +741,7 @@
 		phys_cpu_present_map = physid_mask_of_physid(boot_cpu_id);
 	else
 		phys_cpu_present_map = physid_mask_of_physid(0);
-	cpu_set(0, cpu_sibling_map[0]);
+	cpu_set(0, per_cpu(cpu_sibling_map, 0));
 	cpu_set(0, per_cpu(cpu_core_map, 0));
 }
 
@@ -982,13 +982,13 @@
 		/*
 		 * last thread sibling in this cpu core going down
 		 */
-		if (cpus_weight(cpu_sibling_map[cpu]) == 1)
+		if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1)
 			c[sibling].booted_cores--;
 	}
 			
-	for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
-		cpu_clear(cpu, cpu_sibling_map[sibling]);
-	cpus_clear(cpu_sibling_map[cpu]);
+	for_each_cpu_mask(sibling, per_cpu(cpu_sibling_map, cpu))
+		cpu_clear(cpu, per_cpu(cpu_sibling_map, sibling));
+	cpus_clear(per_cpu(cpu_sibling_map, cpu));
 	cpus_clear(per_cpu(cpu_core_map, cpu));
 	c[cpu].phys_proc_id = 0;
 	c[cpu].cpu_core_id = 0;
--- a/block/blktrace.c
+++ b/block/blktrace.c
@@ -536,7 +536,7 @@
 	for_each_online_cpu(cpu) {
 		unsigned long long *cpu_off, *sibling_off;
 
-		for_each_cpu_mask(i, cpu_sibling_map[cpu]) {
+		for_each_cpu_mask(i, per_cpu(cpu_sibling_map, cpu)) {
 			if (i == cpu)
 				continue;
 
--- a/include/asm-i386/smp.h
+++ b/include/asm-i386/smp.h
@@ -30,7 +30,7 @@
 extern void smp_alloc_memory(void);
 extern int pic_mode;
 extern int smp_num_siblings;
-extern cpumask_t cpu_sibling_map[];
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
 DECLARE_PER_CPU(cpumask_t, cpu_core_map);
 
 extern void (*mtrr_hook) (void);
--- a/include/asm-i386/topology.h
+++ b/include/asm-i386/topology.h
@@ -31,7 +31,7 @@
 #define topology_physical_package_id(cpu)	(cpu_data[cpu].phys_proc_id)
 #define topology_core_id(cpu)			(cpu_data[cpu].cpu_core_id)
 #define topology_core_siblings(cpu)		(per_cpu(cpu_core_map, cpu))
-#define topology_thread_siblings(cpu)		(cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu)		(per_cpu(cpu_sibling_map, cpu))
 #endif
 
 #ifdef CONFIG_NUMA
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -38,12 +38,14 @@
 extern int smp_num_siblings;
 extern void smp_send_reschedule(int cpu);
 
-extern cpumask_t cpu_sibling_map[NR_CPUS];
 /*
- * cpu_core_map lives in a per cpu area
+ * cpu_sibling_map and cpu_core_map now live
+ * in the per cpu area
  *
+ * extern cpumask_t cpu_sibling_map[NR_CPUS];
  * extern cpumask_t cpu_core_map[NR_CPUS];
  */
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
 DECLARE_PER_CPU(cpumask_t, cpu_core_map);
 
 extern u8 cpu_llc_id[NR_CPUS];
--- a/include/asm-x86_64/topology.h
+++ b/include/asm-x86_64/topology.h
@@ -59,7 +59,7 @@
 #define topology_physical_package_id(cpu)	(cpu_data[cpu].phys_proc_id)
 #define topology_core_id(cpu)			(cpu_data[cpu].cpu_core_id)
 #define topology_core_siblings(cpu)		(per_cpu(cpu_core_map, cpu))
-#define topology_thread_siblings(cpu)		(cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu)		(per_cpu(cpu_sibling_map, cpu))
 #define mc_capable()			(boot_cpu_data.x86_max_cores > 1)
 #define smt_capable() 			(smp_num_siblings > 1)
 #endif
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5854,7 +5854,7 @@
 			     struct sched_group **sg)
 {
 	int group;
-	cpumask_t mask = cpu_sibling_map[cpu];
+	cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
 	cpus_and(mask, mask, *cpu_map);
 	group = first_cpu(mask);
 	if (sg)
@@ -5883,7 +5883,7 @@
 	cpus_and(mask, mask, *cpu_map);
 	group = first_cpu(mask);
 #elif defined(CONFIG_SCHED_SMT)
-	cpumask_t mask = cpu_sibling_map[cpu];
+	cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
 	cpus_and(mask, mask, *cpu_map);
 	group = first_cpu(mask);
 #else
@@ -6118,7 +6118,7 @@
 		p = sd;
 		sd = &per_cpu(cpu_domains, i);
 		*sd = SD_SIBLING_INIT;
-		sd->span = cpu_sibling_map[i];
+		sd->span = per_cpu(cpu_sibling_map, i);
 		cpus_and(sd->span, sd->span, *cpu_map);
 		sd->parent = p;
 		p->child = sd;
@@ -6129,7 +6129,7 @@
 #ifdef CONFIG_SCHED_SMT
 	/* Set up CPU (sibling) groups */
 	for_each_cpu_mask(i, *cpu_map) {
-		cpumask_t this_sibling_map = cpu_sibling_map[i];
+		cpumask_t this_sibling_map = per_cpu(cpu_sibling_map, i);
 		cpus_and(this_sibling_map, this_sibling_map, *cpu_map);
 		if (i != first_cpu(this_sibling_map))
 			continue;

-- 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 05/10] x86: Convert x86_cpu_to_apicid to be a per cpu variable (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (3 preceding siblings ...)
  2007-09-12  1:56 ` [PATCH 04/10] x86: Convert cpu_sibling_map " travis
@ 2007-09-12  1:56 ` travis
  2007-09-12  1:56 ` [PATCH 06/10] x86: Convert cpu_llc_id " travis
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

This patch converts the x86_cpu_to_apicid array to a per cpu
variable.  This saves sizeof(apicid) bytes for every unused cpu slot.
Access is mostly from startup and CPU HOTPLUG functions.

MP_processor_info() is one of the functions that require access
to the x86_cpu_to_apicid array before the per_cpu data area is
set up.  For this case, a pointer to the __initdata array is
initialized in setup_arch() and cleared in smp_prepare_cpus()
after the per_cpu data area is initialized.

A second change is included to change the initial array value
of ARCH i386 from 0xff to BAD_APICID to be consistent with
ARCH x86_64.
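
The hand-over described above can be modelled in user space.  This is
only a sketch under stated assumptions, not the kernel code: plain
arrays stand in for the __initdata table and for the per_cpu slots,
NR_CPUS is shrunk to 8, and the names early_setup(), record_apicid(),
publish_apicids() and lookup_apicid() are hypothetical stand-ins for
the roles played by setup_arch(), MP_processor_info(),
smp_set_apicids() and per_cpu(x86_cpu_to_apicid, cpu) in the x86_64
hunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS    8       /* hypothetical, kept small for the sketch */
#define BAD_APICID 0xFFu

static uint8_t apicid_init[NR_CPUS];    /* stands in for the __initdata array */
static uint8_t apicid_percpu[NR_CPUS];  /* stands in for the per_cpu slots */
static void *apicid_ptr;                /* non-NULL while the early array is live */

/* Early boot (role of setup_arch()): mark all slots bad, point at the array. */
void early_setup(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		apicid_init[cpu] = BAD_APICID;
		apicid_percpu[cpu] = BAD_APICID;
	}
	apicid_ptr = apicid_init;
}

/* Writer (role of MP_processor_info()): store into whichever area is live. */
void record_apicid(int cpu, uint8_t apicid)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (apicid_ptr) {
		uint8_t *early = apicid_ptr;
		early[cpu] = apicid;
	} else {
		apicid_percpu[cpu] = apicid;
	}
}

/* Hand-over (role of smp_set_apicids()): copy the table, clear the pointer. */
void publish_apicids(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		apicid_percpu[cpu] = apicid_init[cpu];
	apicid_ptr = NULL;
}

/* Reader after hand-over, the analogue of per_cpu(x86_cpu_to_apicid, cpu). */
uint8_t lookup_apicid(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return apicid_percpu[cpu];
}
```

Calling early_setup(), then record_apicid(), then publish_apicids()
reproduces the boot ordering; any record_apicid() call made after the
hand-over goes straight to the per-cpu slot and leaves the early table
untouched.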

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/i386/kernel/acpi/boot.c      |    2 +-
 arch/i386/kernel/smp.c            |    2 +-
 arch/i386/kernel/smpboot.c        |   22 +++++++++++++++-------
 arch/x86_64/kernel/genapic.c      |   15 ++++++++++++---
 arch/x86_64/kernel/genapic_flat.c |    2 +-
 arch/x86_64/kernel/mpparse.c      |   15 +++++++++++++--
 arch/x86_64/kernel/setup.c        |    5 +++++
 arch/x86_64/kernel/smpboot.c      |   23 ++++++++++++++++++++++-
 arch/x86_64/mm/numa.c             |    2 +-
 include/asm-i386/smp.h            |    6 ++++--
 include/asm-x86_64/ipi.h          |    2 +-
 include/asm-x86_64/smp.h          |    6 ++++--
 12 files changed, 80 insertions(+), 22 deletions(-)

--- a/arch/i386/kernel/acpi/boot.c
+++ b/arch/i386/kernel/acpi/boot.c
@@ -555,7 +555,7 @@
 
 int acpi_unmap_lsapic(int cpu)
 {
-	x86_cpu_to_apicid[cpu] = -1;
+	per_cpu(x86_cpu_to_apicid, cpu) = -1;
 	cpu_clear(cpu, cpu_present_map);
 	num_processors--;
 
--- a/arch/i386/kernel/smp.c
+++ b/arch/i386/kernel/smp.c
@@ -673,7 +673,7 @@
 	int i;
 
 	for (i = 0; i < NR_CPUS; i++) {
-		if (x86_cpu_to_apicid[i] == apic_id)
+		if (per_cpu(x86_cpu_to_apicid, i) == apic_id)
 			return i;
 	}
 	return -1;
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -92,9 +92,17 @@
 struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;
 EXPORT_SYMBOL(cpu_data);
 
-u8 x86_cpu_to_apicid[NR_CPUS] __read_mostly =
-			{ [0 ... NR_CPUS-1] = 0xff };
-EXPORT_SYMBOL(x86_cpu_to_apicid);
+/*
+ * The following static array is used during kernel startup
+ * and the x86_cpu_to_apicid_ptr contains the address of the
+ * array during this time.  It is zeroed once the per_cpu
+ * data area is set up.
+ */
+u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata =
+			{ [0 ... NR_CPUS-1] = BAD_APICID };
+void *x86_cpu_to_apicid_ptr;
+DEFINE_PER_CPU(u8, x86_cpu_to_apicid) = BAD_APICID;
+EXPORT_PER_CPU_SYMBOL(x86_cpu_to_apicid);
 
 u8 apicid_2_node[MAX_APICID];
 
@@ -804,7 +812,7 @@
 
 	irq_ctx_init(cpu);
 
-	x86_cpu_to_apicid[cpu] = apicid;
+	per_cpu(x86_cpu_to_apicid, cpu) = apicid;
 	/*
 	 * This grunge runs the startup process for
 	 * the targeted processor.
@@ -866,7 +874,7 @@
 		cpu_clear(cpu, cpu_initialized); /* was set by cpu_init() */
 		cpucount--;
 	} else {
-		x86_cpu_to_apicid[cpu] = apicid;
+		per_cpu(x86_cpu_to_apicid, cpu) = apicid;
 		cpu_set(cpu, cpu_present_map);
 	}
 
@@ -915,7 +923,7 @@
 	struct warm_boot_cpu_info info;
 	int	apicid, ret;
 
-	apicid = x86_cpu_to_apicid[cpu];
+	apicid = per_cpu(x86_cpu_to_apicid, cpu);
 	if (apicid == BAD_APICID) {
 		ret = -ENODEV;
 		goto exit;
@@ -965,7 +973,7 @@
 
 	boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
 	boot_cpu_logical_apicid = logical_smp_processor_id();
-	x86_cpu_to_apicid[0] = boot_cpu_physical_apicid;
+	per_cpu(x86_cpu_to_apicid, 0) = boot_cpu_physical_apicid;
 
 	current_thread_info()->cpu = 0;
 
--- a/arch/x86_64/kernel/mpparse.c
+++ b/arch/x86_64/kernel/mpparse.c
@@ -86,7 +86,7 @@
 	return sum & 0xFF;
 }
 
-static void __cpuinit MP_processor_info (struct mpc_config_processor *m)
+static void __cpuinit MP_processor_info(struct mpc_config_processor *m)
 {
 	int cpu;
 	cpumask_t tmp_map;
@@ -123,7 +123,18 @@
 		cpu = 0;
  	}
 	bios_cpu_apicid[cpu] = m->mpc_apicid;
-	x86_cpu_to_apicid[cpu] = m->mpc_apicid;
+	/*
+	 * We get called early in the start_kernel initialization
+	 * process when the per_cpu data area is not yet set up, so we
+	 * use a static array that is removed after the per_cpu data
+	 * area is created.
+	 */
+	if (x86_cpu_to_apicid_ptr) {
+		u8 *x86_cpu_to_apicid = (u8 *)x86_cpu_to_apicid_ptr;
+		x86_cpu_to_apicid[cpu] = m->mpc_apicid;
+	} else {
+		per_cpu(x86_cpu_to_apicid, cpu) = m->mpc_apicid;
+	}
 
 	cpu_set(cpu, cpu_possible_map);
 	cpu_set(cpu, cpu_present_map);
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -701,7 +701,7 @@
 		clear_node_cpumask(cpu); /* was set by numa_add_cpu */
 		cpu_clear(cpu, cpu_present_map);
 		cpu_clear(cpu, cpu_possible_map);
-		x86_cpu_to_apicid[cpu] = BAD_APICID;
+		per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
 		return -EIO;
 	}
 
@@ -848,6 +848,26 @@
 }
 
 /*
+ * Copy apicid's found by MP_processor_info from initial array to the per cpu
+ * data area.  The x86_cpu_to_apicid_init array is then expendable and the
+ * x86_cpu_to_apicid_ptr is zeroed indicating that the static array is no
+ * longer available.
+ */
+void __init smp_set_apicids(void)
+{
+	int cpu;
+
+	for_each_cpu_mask(cpu, cpu_possible_map) {
+		if (per_cpu_offset(cpu))
+			per_cpu(x86_cpu_to_apicid, cpu) =
+						x86_cpu_to_apicid_init[cpu];
+	}
+
+	/* indicate the static array will be going away soon */
+	x86_cpu_to_apicid_ptr = NULL;
+}
+
+/*
  * Prepare for SMP bootup.  The MP table or ACPI has been read
  * earlier.  Just do some sanity checking here and enable APIC mode.
  */
@@ -856,6 +876,7 @@
 	nmi_watchdog_default();
 	current_cpu_data = boot_cpu_data;
 	current_thread_info()->cpu = 0;  /* needed? */
+	smp_set_apicids();
 	set_cpu_sibling_map(0);
 
 	if (smp_sanity_check(max_cpus) < 0) {
--- a/arch/x86_64/mm/numa.c
+++ b/arch/x86_64/mm/numa.c
@@ -612,7 +612,7 @@
 {
 	int i;
  	for (i = 0; i < NR_CPUS; i++) {
-		u8 apicid = x86_cpu_to_apicid[i];
+		u8 apicid = x86_cpu_to_apicid_init[i];
 		if (apicid == BAD_APICID)
 			continue;
 		if (apicid_to_node[apicid] == NUMA_NO_NODE)
--- a/include/asm-i386/smp.h
+++ b/include/asm-i386/smp.h
@@ -39,9 +39,11 @@
 extern void unlock_ipi_call_lock(void);
 
 #define MAX_APICID 256
-extern u8 x86_cpu_to_apicid[];
+extern u8 __initdata x86_cpu_to_apicid_init[];
+extern void *x86_cpu_to_apicid_ptr;
+DECLARE_PER_CPU(u8, x86_cpu_to_apicid);
 
-#define cpu_physical_id(cpu)	x86_cpu_to_apicid[cpu]
+#define cpu_physical_id(cpu)	per_cpu(x86_cpu_to_apicid, cpu)
 
 extern void set_cpu_sibling_map(int cpu);
 
--- a/include/asm-x86_64/ipi.h
+++ b/include/asm-x86_64/ipi.h
@@ -119,7 +119,7 @@
 	 */
 	local_irq_save(flags);
 	for_each_cpu_mask(query_cpu, mask) {
-		__send_IPI_dest_field(x86_cpu_to_apicid[query_cpu],
+		__send_IPI_dest_field(per_cpu(x86_cpu_to_apicid, query_cpu),
 				      vector, APIC_DEST_PHYSICAL);
 	}
 	local_irq_restore(flags);
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -85,7 +85,9 @@
  * Some lowlevel functions might want to know about
  * the real APIC ID <-> CPU # mapping.
  */
-extern u8 x86_cpu_to_apicid[NR_CPUS];	/* physical ID */
+extern u8 __initdata x86_cpu_to_apicid_init[];
+extern void *x86_cpu_to_apicid_ptr;
+DECLARE_PER_CPU(u8, x86_cpu_to_apicid);	/* physical ID */
 extern u8 bios_cpu_apicid[];
 
 static inline int cpu_present_to_apicid(int mps_cpu)
@@ -116,7 +118,7 @@
 }
 
 #ifdef CONFIG_SMP
-#define cpu_physical_id(cpu)		x86_cpu_to_apicid[cpu]
+#define cpu_physical_id(cpu)		per_cpu(x86_cpu_to_apicid, cpu)
 #else
 #define cpu_physical_id(cpu)		boot_cpu_id
 #endif /* !CONFIG_SMP */
--- a/arch/x86_64/kernel/genapic_flat.c
+++ b/arch/x86_64/kernel/genapic_flat.c
@@ -172,7 +172,7 @@
 	 */
 	cpu = first_cpu(cpumask);
 	if ((unsigned)cpu < NR_CPUS)
-		return x86_cpu_to_apicid[cpu];
+		return per_cpu(x86_cpu_to_apicid, cpu);
 	else
 		return BAD_APICID;
 }
--- a/arch/x86_64/kernel/genapic.c
+++ b/arch/x86_64/kernel/genapic.c
@@ -24,10 +24,19 @@
 #include <acpi/acpi_bus.h>
 #endif
 
-/* which logical CPU number maps to which CPU (physical APIC ID) */
-u8 x86_cpu_to_apicid[NR_CPUS] __read_mostly
+/*
+ * which logical CPU number maps to which CPU (physical APIC ID)
+ *
+ * The following static array is used during kernel startup
+ * and the x86_cpu_to_apicid_ptr contains the address of the
+ * array during this time.  It is zeroed once the per_cpu
+ * data area is set up.
+ */
+u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata
 					= { [0 ... NR_CPUS-1] = BAD_APICID };
-EXPORT_SYMBOL(x86_cpu_to_apicid);
+void *x86_cpu_to_apicid_ptr;
+DEFINE_PER_CPU(u8, x86_cpu_to_apicid) = BAD_APICID;
+EXPORT_PER_CPU_SYMBOL(x86_cpu_to_apicid);
 
 struct genapic __read_mostly *genapic = &apic_flat;
 
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -276,6 +276,11 @@
 
 	dmi_scan_machine();
 
+#ifdef CONFIG_SMP
+	/* setup to use the static apicid table during kernel startup */
+	x86_cpu_to_apicid_ptr = (void *)&x86_cpu_to_apicid_init;
+#endif
+
 #ifdef CONFIG_ACPI
 	/*
 	 * Initialize the ACPI boot-time table parser (gets the RSDP and SDT).

-- 


* [PATCH 06/10] x86: Convert cpu_llc_id to be a per cpu variable (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (4 preceding siblings ...)
  2007-09-12  1:56 ` [PATCH 05/10] x86: Convert x86_cpu_to_apicid " travis
@ 2007-09-12  1:56 ` travis
  2007-09-12  1:56 ` [PATCH 07/10] x86: acpi-use-cpu_physical_id (v3) travis
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

Convert cpu_llc_id from a static array sized by NR_CPUS to a
per_cpu variable.  This saves one cpu_llc_id entry for every
unused cpu slot.  Access is mostly from startup and CPU HOTPLUG
functions.

Note there's an additional change of the type of cpu_llc_id
from int to u8 for ARCH i386 to correspond with the same
type in ARCH x86_64.
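
As an illustration of how the u8 slots and the BAD_APICID sentinel are
used, the check that decides whether two cpus share a last level cache
can be sketched in user space.  This is a sketch only: a plain array
stands in for the per_cpu variable, NR_CPUS is shrunk to 4, and
set_llc_id() and shares_llc() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS    4       /* hypothetical, kept small for the sketch */
#define BAD_APICID 0xFFu   /* sentinel: no cache id recorded yet */

/* One u8 per cpu, standing in for DEFINE_PER_CPU(u8, cpu_llc_id). */
static uint8_t llc_id[NR_CPUS] = {
	BAD_APICID, BAD_APICID, BAD_APICID, BAD_APICID
};

/* Record the last level cache id detected for a cpu. */
void set_llc_id(int cpu, uint8_t id)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	llc_id[cpu] = id;
}

/* Same test as the smpboot.c hunks: a valid id that matches the other cpu's. */
int shares_llc(int cpu, int other)
{
	return llc_id[cpu] != BAD_APICID && llc_id[cpu] == llc_id[other];
}
```

Two cpus that both still hold the sentinel compare equal, which is why
the first half of the condition is needed.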

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/i386/kernel/cpu/intel_cacheinfo.c |    4 ++--
 arch/i386/kernel/smpboot.c             |    6 +++---
 arch/x86_64/kernel/smpboot.c           |    6 +++---
 include/asm-i386/processor.h           |    6 +++++-
 include/asm-x86_64/smp.h               |    9 ++++-----
 5 files changed, 17 insertions(+), 14 deletions(-)

--- a/arch/i386/kernel/cpu/intel_cacheinfo.c
+++ b/arch/i386/kernel/cpu/intel_cacheinfo.c
@@ -417,14 +417,14 @@
 	if (new_l2) {
 		l2 = new_l2;
 #ifdef CONFIG_X86_HT
-		cpu_llc_id[cpu] = l2_id;
+		per_cpu(cpu_llc_id, cpu) = l2_id;
 #endif
 	}
 
 	if (new_l3) {
 		l3 = new_l3;
 #ifdef CONFIG_X86_HT
-		cpu_llc_id[cpu] = l3_id;
+		per_cpu(cpu_llc_id, cpu) = l3_id;
 #endif
 	}
 
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -67,7 +67,7 @@
 EXPORT_SYMBOL(smp_num_siblings);
 
 /* Last level cache ID of each logical CPU */
-int cpu_llc_id[NR_CPUS] __cpuinitdata = {[0 ... NR_CPUS-1] = BAD_APICID};
+DEFINE_PER_CPU(u8, cpu_llc_id) = BAD_APICID;
 
 /* representing HT siblings of each logical CPU */
 DEFINE_PER_CPU(cpumask_t, cpu_sibling_map);
@@ -348,8 +348,8 @@
 	}
 
 	for_each_cpu_mask(i, cpu_sibling_setup_map) {
-		if (cpu_llc_id[cpu] != BAD_APICID &&
-		    cpu_llc_id[cpu] == cpu_llc_id[i]) {
+		if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
+		    per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
 			cpu_set(i, c[cpu].llc_shared_map);
 			cpu_set(cpu, c[i].llc_shared_map);
 		}
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -65,7 +65,7 @@
 EXPORT_SYMBOL(smp_num_siblings);
 
 /* Last level cache ID of each logical CPU */
-u8 cpu_llc_id[NR_CPUS] __cpuinitdata  = {[0 ... NR_CPUS-1] = BAD_APICID};
+DEFINE_PER_CPU(u8, cpu_llc_id) = BAD_APICID;
 
 /* Bitmask of currently online CPUs */
 cpumask_t cpu_online_map __read_mostly;
@@ -285,8 +285,8 @@
 	}
 
 	for_each_cpu_mask(i, cpu_sibling_setup_map) {
-		if (cpu_llc_id[cpu] != BAD_APICID &&
-		    cpu_llc_id[cpu] == cpu_llc_id[i]) {
+		if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
+		    per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
 			cpu_set(i, c[cpu].llc_shared_map);
 			cpu_set(cpu, c[i].llc_shared_map);
 		}
--- a/include/asm-i386/processor.h
+++ b/include/asm-i386/processor.h
@@ -110,7 +110,11 @@
 #define current_cpu_data boot_cpu_data
 #endif
 
-extern	int cpu_llc_id[NR_CPUS];
+/*
+ * the following now lives in the per cpu area:
+ * extern	int cpu_llc_id[NR_CPUS];
+ */
+DECLARE_PER_CPU(u8, cpu_llc_id);
 extern char ignore_fpu_irq;
 
 void __init cpu_detect(struct cpuinfo_x86 *c);
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -39,16 +39,14 @@
 extern void smp_send_reschedule(int cpu);
 
 /*
- * cpu_sibling_map and cpu_core_map now live
- * in the per cpu area
- *
+ * the following now live in the per cpu area:
  * extern cpumask_t cpu_sibling_map[NR_CPUS];
  * extern cpumask_t cpu_core_map[NR_CPUS];
+ * extern u8 cpu_llc_id[NR_CPUS];
  */
 DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
 DECLARE_PER_CPU(cpumask_t, cpu_core_map);
-
-extern u8 cpu_llc_id[NR_CPUS];
+DECLARE_PER_CPU(u8, cpu_llc_id);
 
 #define SMP_TRAMPOLINE_BASE 0x6000
 
@@ -120,6 +118,7 @@
 #ifdef CONFIG_SMP
 #define cpu_physical_id(cpu)		per_cpu(x86_cpu_to_apicid, cpu)
 #else
+extern unsigned int boot_cpu_id;
 #define cpu_physical_id(cpu)		boot_cpu_id
 #endif /* !CONFIG_SMP */
 #endif

-- 


* [PATCH 07/10] x86: acpi-use-cpu_physical_id (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (5 preceding siblings ...)
  2007-09-12  1:56 ` [PATCH 06/10] x86: Convert cpu_llc_id " travis
@ 2007-09-12  1:56 ` travis
  2007-09-12  1:56 ` [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3) travis
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

This is from an earlier message from Christoph Lameter:

    processor_core.c currently tries to determine the apicid by special casing
    for IA64 and x86. The desired information is readily available via

	    cpu_physical_id()

    on IA64, i386 and x86_64.

    Signed-off-by: Christoph Lameter <clameter@sgi.com>

Additionally, boot_cpu_id needed to be exported to fix compile errors in
dma code when !CONFIG_SMP.
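
The benefit of going through cpu_physical_id() is that the lookup loop
no longer needs to know how the ids are stored.  A minimal user-space
sketch, assuming a plain array behind the accessor (the names
apicid_slot, set_apicid() and apicid_to_cpu() are hypothetical, and
NR_CPUS is shrunk to 4):

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS    4       /* hypothetical, kept small for the sketch */
#define BAD_APICID 0xFFu

/* Backing store; in the kernel this is arch specific (array or per_cpu). */
static uint8_t apicid_slot[NR_CPUS] = {
	BAD_APICID, BAD_APICID, BAD_APICID, BAD_APICID
};

/* The accessor hides the storage layout from callers. */
#define cpu_physical_id(cpu)	(apicid_slot[(cpu)])

/* Record the physical id for a logical cpu. */
void set_apicid(int cpu, uint8_t id)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_physical_id(cpu) = id;
}

/* Same shape as the loop in processor_core.c: apic id -> logical cpu. */
int apicid_to_cpu(unsigned int apic_id)
{
	for (int i = 0; i < NR_CPUS; ++i) {
		if (cpu_physical_id(i) == apic_id)
			return i;
	}
	return -1;
}
```

Switching the backing store to a per-cpu area would only change the
macro body; apicid_to_cpu() stays as it is.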

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86_64/kernel/mpparse.c  |    2 ++
 drivers/acpi/processor_core.c |    8 +-------
 2 files changed, 3 insertions(+), 7 deletions(-)

--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -419,12 +419,6 @@
 	return 0;
 }
 
-#ifdef CONFIG_IA64
-#define arch_cpu_to_apicid 	ia64_cpu_to_sapicid
-#else
-#define arch_cpu_to_apicid 	x86_cpu_to_apicid
-#endif
-
 static int map_madt_entry(u32 acpi_id)
 {
 	unsigned long madt_end, entry;
@@ -498,7 +492,7 @@
 		return apic_id;
 
 	for (i = 0; i < NR_CPUS; ++i) {
-		if (arch_cpu_to_apicid[i] == apic_id)
+		if (cpu_physical_id(i) == apic_id)
 			return i;
 	}
 	return -1;
--- a/arch/x86_64/kernel/mpparse.c
+++ b/arch/x86_64/kernel/mpparse.c
@@ -57,6 +57,8 @@
 
 /* Processor that is doing the boot up */
 unsigned int boot_cpu_id = -1U;
+EXPORT_SYMBOL(boot_cpu_id);
+
 /* Internal processor count */
 unsigned int num_processors __cpuinitdata = 0;
 

-- 


* [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (6 preceding siblings ...)
  2007-09-12  1:56 ` [PATCH 07/10] x86: acpi-use-cpu_physical_id (v3) travis
@ 2007-09-12  1:56 ` travis
  2007-09-28  9:49   ` Paul Jackson
  2007-09-12  1:56 ` [PATCH 09/10] ppc64: " travis
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

Convert cpu_sibling_map to a per_cpu cpumask_t array for the ia64
architecture.  This fixes build errors in block/blktrace.c and
kernel/sched.c when CONFIG_SCHED_SMT is defined.


There was one access to cpu_sibling_map before the per_cpu data
area was created, so that access was moved to after the per_cpu
area is set up.

Tested and verified on an A4700.

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/ia64/kernel/setup.c    |    4 ----
 arch/ia64/kernel/smpboot.c  |   18 ++++++++++--------
 arch/ia64/mm/contig.c       |    6 ++++++
 include/asm-ia64/smp.h      |    2 +-
 include/asm-ia64/topology.h |    2 +-
 5 files changed, 18 insertions(+), 14 deletions(-)

--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -528,10 +528,6 @@
 
 #ifdef CONFIG_SMP
 	cpu_physical_id(0) = hard_smp_processor_id();
-
-	cpu_set(0, cpu_sibling_map[0]);
-	cpu_set(0, cpu_core_map[0]);
-
 	check_for_logical_procs();
 	if (smp_num_cpucores > 1)
 		printk(KERN_INFO
--- a/arch/ia64/kernel/smpboot.c
+++ b/arch/ia64/kernel/smpboot.c
@@ -138,7 +138,9 @@
 EXPORT_SYMBOL(cpu_possible_map);
 
 cpumask_t cpu_core_map[NR_CPUS] __cacheline_aligned;
-cpumask_t cpu_sibling_map[NR_CPUS] __cacheline_aligned;
+DEFINE_PER_CPU_SHARED_ALIGNED(cpumask_t, cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
+
 int smp_num_siblings = 1;
 int smp_num_cpucores = 1;
 
@@ -650,12 +652,12 @@
 {
 	int i;
 
-	for_each_cpu_mask(i, cpu_sibling_map[cpu])
-		cpu_clear(cpu, cpu_sibling_map[i]);
+	for_each_cpu_mask(i, per_cpu(cpu_sibling_map, cpu))
+		cpu_clear(cpu, per_cpu(cpu_sibling_map, i));
 	for_each_cpu_mask(i, cpu_core_map[cpu])
 		cpu_clear(cpu, cpu_core_map[i]);
 
-	cpu_sibling_map[cpu] = cpu_core_map[cpu] = CPU_MASK_NONE;
+	per_cpu(cpu_sibling_map, cpu) = cpu_core_map[cpu] = CPU_MASK_NONE;
 }
 
 static void
@@ -666,7 +668,7 @@
 	if (cpu_data(cpu)->threads_per_core == 1 &&
 	    cpu_data(cpu)->cores_per_socket == 1) {
 		cpu_clear(cpu, cpu_core_map[cpu]);
-		cpu_clear(cpu, cpu_sibling_map[cpu]);
+		cpu_clear(cpu, per_cpu(cpu_sibling_map, cpu));
 		return;
 	}
 
@@ -807,8 +809,8 @@
 			cpu_set(i, cpu_core_map[cpu]);
 			cpu_set(cpu, cpu_core_map[i]);
 			if (cpu_data(cpu)->core_id == cpu_data(i)->core_id) {
-				cpu_set(i, cpu_sibling_map[cpu]);
-				cpu_set(cpu, cpu_sibling_map[i]);
+				cpu_set(i, per_cpu(cpu_sibling_map, cpu));
+				cpu_set(cpu, per_cpu(cpu_sibling_map, i));
 			}
 		}
 	}
@@ -839,7 +841,7 @@
 
 	if (cpu_data(cpu)->threads_per_core == 1 &&
 	    cpu_data(cpu)->cores_per_socket == 1) {
-		cpu_set(cpu, cpu_sibling_map[cpu]);
+		cpu_set(cpu, per_cpu(cpu_sibling_map, cpu));
 		cpu_set(cpu, cpu_core_map[cpu]);
 		return 0;
 	}
--- a/include/asm-ia64/smp.h
+++ b/include/asm-ia64/smp.h
@@ -58,7 +58,7 @@
 
 extern cpumask_t cpu_online_map;
 extern cpumask_t cpu_core_map[NR_CPUS];
-extern cpumask_t cpu_sibling_map[NR_CPUS];
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
 extern int smp_num_siblings;
 extern int smp_num_cpucores;
 extern void __iomem *ipi_base_addr;
--- a/include/asm-ia64/topology.h
+++ b/include/asm-ia64/topology.h
@@ -112,7 +112,7 @@
 #define topology_physical_package_id(cpu)	(cpu_data(cpu)->socket_id)
 #define topology_core_id(cpu)			(cpu_data(cpu)->core_id)
 #define topology_core_siblings(cpu)		(cpu_core_map[cpu])
-#define topology_thread_siblings(cpu)		(cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu)		(per_cpu(cpu_sibling_map, cpu))
 #define smt_capable() 				(smp_num_siblings > 1)
 #endif
 
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -212,6 +212,12 @@
 			cpu_data += PERCPU_PAGE_SIZE;
 			per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
 		}
+		/*
+		 * cpu_sibling_map is now a per_cpu variable - it needs to
+		 * be accessed after per_cpu_init() sets up the per_cpu area.
+		 */
+		cpu_set(0, per_cpu(cpu_sibling_map, 0));
+		cpu_set(0, cpu_core_map[0]);
 	}
 	return __per_cpu_start + __per_cpu_offset[smp_processor_id()];
 }

-- 


* [PATCH 09/10] ppc64: Convert cpu_sibling_map to a per_cpu data array (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (7 preceding siblings ...)
  2007-09-12  1:56 ` [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3) travis
@ 2007-09-12  1:56 ` travis
  2007-09-17  6:28   ` Stephen Rothwell
  2007-09-12  1:56 ` [PATCH 10/10] sparc64: " travis
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

Convert cpu_sibling_map to a per_cpu cpumask_t array for the ppc64
architecture.  This fixes build errors in block/blktrace.c and
kernel/sched.c when CONFIG_SCHED_SMT is defined.

Note: these changes have been neither built nor tested.
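
The ppc64 headers add a function-like macro, cpu_sibling_map(cpu), with
the same name as the per_cpu variable it wraps.  The C preprocessor
does not re-expand a macro inside its own expansion, so this is legal.
The sketch below shows the idea in user space; the DEFINE_PER_CPU and
per_cpu definitions here are simplified stand-ins (a name-mangled
array), not the kernel's, cpumask_t is reduced to a 32-bit word, and
set_thread_siblings() is a hypothetical helper modelled on the
setup-common.c hunk.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4              /* hypothetical, kept small for the sketch */
typedef uint32_t cpumask_t;    /* one bit per cpu, sketch only */

/* Simplified stand-ins: mangle the name and keep one slot per cpu. */
#define DEFINE_PER_CPU(type, name)	type per_cpu__##name[NR_CPUS]
#define per_cpu(name, cpu)		(per_cpu__##name[(cpu)])

DEFINE_PER_CPU(cpumask_t, cpu_sibling_map);

/* Same name as the variable; the inner use is not expanded again. */
#define cpu_sibling_map(cpu)	per_cpu(cpu_sibling_map, cpu)

/* Mark a cpu and, when SMT is present, its partner thread (cpu ^ 1). */
void set_thread_siblings(int cpu, int smt)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_sibling_map(cpu) |= (cpumask_t)1 << cpu;
	if (smt)
		cpu_sibling_map(cpu) |= (cpumask_t)1 << (cpu ^ 0x1);
}
```

Callers keep the familiar cpu_sibling_map spelling and only trade the
square brackets for parentheses.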

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/powerpc/kernel/setup-common.c        |    4 ++--
 arch/powerpc/kernel/smp.c                 |    4 ++--
 arch/powerpc/platforms/cell/cbe_cpufreq.c |    2 +-
 include/asm-powerpc/smp.h                 |    4 +++-
 include/asm-powerpc/topology.h            |    2 +-
 5 files changed, 9 insertions(+), 7 deletions(-)

--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -415,9 +415,9 @@
 	 * Do the sibling map; assume only two threads per processor.
 	 */
 	for_each_possible_cpu(cpu) {
-		cpu_set(cpu, cpu_sibling_map[cpu]);
+		cpu_set(cpu, cpu_sibling_map(cpu));
 		if (cpu_has_feature(CPU_FTR_SMT))
-			cpu_set(cpu ^ 0x1, cpu_sibling_map[cpu]);
+			cpu_set(cpu ^ 0x1, cpu_sibling_map(cpu));
 	}
 
 	vdso_data->processorCount = num_present_cpus();
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -61,11 +61,11 @@
 
 cpumask_t cpu_possible_map = CPU_MASK_NONE;
 cpumask_t cpu_online_map = CPU_MASK_NONE;
-cpumask_t cpu_sibling_map[NR_CPUS] = { [0 ... NR_CPUS-1] = CPU_MASK_NONE };
+DEFINE_PER_CPU(cpumask_t, cpu_sibling_map) = CPU_MASK_NONE;
 
 EXPORT_SYMBOL(cpu_online_map);
 EXPORT_SYMBOL(cpu_possible_map);
-EXPORT_SYMBOL(cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 
 /* SMP operations for this machine */
 struct smp_ops_t *smp_ops;
--- a/arch/powerpc/platforms/cell/cbe_cpufreq.c
+++ b/arch/powerpc/platforms/cell/cbe_cpufreq.c
@@ -119,7 +119,7 @@
 	policy->cur = cbe_freqs[cur_pmode].frequency;
 
 #ifdef CONFIG_SMP
-	policy->cpus = cpu_sibling_map[policy->cpu];
+	policy->cpus = cpu_sibling_map(policy->cpu);
 #endif
 
 	cpufreq_frequency_table_get_attr(cbe_freqs, policy->cpu);
--- a/include/asm-powerpc/smp.h
+++ b/include/asm-powerpc/smp.h
@@ -25,6 +25,7 @@
 
 #ifdef CONFIG_PPC64
 #include <asm/paca.h>
+#include <asm/percpu.h>
 #endif
 
 extern int boot_cpuid;
@@ -58,7 +59,8 @@
 					(smp_hw_index[(cpu)] = (phys))
 #endif
 
-extern cpumask_t cpu_sibling_map[NR_CPUS];
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
+#define cpu_sibling_map(cpu) per_cpu(cpu_sibling_map, cpu)
 
 /* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
  *
--- a/include/asm-powerpc/topology.h
+++ b/include/asm-powerpc/topology.h
@@ -108,7 +108,7 @@
 #ifdef CONFIG_PPC64
 #include <asm/smp.h>
 
-#define topology_thread_siblings(cpu)	(cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu)	(cpu_sibling_map(cpu))
 #endif
 #endif
 

-- 


* [PATCH 10/10] sparc64: Convert cpu_sibling_map to a per_cpu data array (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (8 preceding siblings ...)
  2007-09-12  1:56 ` [PATCH 09/10] ppc64: " travis
@ 2007-09-12  1:56 ` travis
  2007-09-13  9:53 ` [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) Andi Kleen
  2007-09-14 23:32 ` Andrew Morton
  11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

Convert cpu_sibling_map to a per_cpu cpumask_t array for the sparc64
architecture.  This fixes build errors in block/blktrace.c and
kernel/sched.c when CONFIG_SCHED_SMT is defined.

Note: these changes have been neither built nor tested.

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/sparc64/kernel/smp.c      |   17 ++++++++---------
 include/asm-sparc64/smp.h      |    3 ++-
 include/asm-sparc64/topology.h |    2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

--- a/arch/sparc64/kernel/smp.c
+++ b/arch/sparc64/kernel/smp.c
@@ -52,14 +52,13 @@
 
 cpumask_t cpu_possible_map __read_mostly = CPU_MASK_NONE;
 cpumask_t cpu_online_map __read_mostly = CPU_MASK_NONE;
-cpumask_t cpu_sibling_map[NR_CPUS] __read_mostly =
-	{ [0 ... NR_CPUS-1] = CPU_MASK_NONE };
+DEFINE_PER_CPU(cpumask_t, cpu_sibling_map) = CPU_MASK_NONE;
 cpumask_t cpu_core_map[NR_CPUS] __read_mostly =
 	{ [0 ... NR_CPUS-1] = CPU_MASK_NONE };
 
 EXPORT_SYMBOL(cpu_possible_map);
 EXPORT_SYMBOL(cpu_online_map);
-EXPORT_SYMBOL(cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 EXPORT_SYMBOL(cpu_core_map);
 
 static cpumask_t smp_commenced_mask;
@@ -1259,16 +1258,16 @@
 	for_each_present_cpu(i) {
 		unsigned int j;
 
-		cpus_clear(cpu_sibling_map[i]);
+		cpus_clear(per_cpu(cpu_sibling_map, i));
 		if (cpu_data(i).proc_id == -1) {
-			cpu_set(i, cpu_sibling_map[i]);
+			cpu_set(i, per_cpu(cpu_sibling_map, i));
 			continue;
 		}
 
 		for_each_present_cpu(j) {
 			if (cpu_data(i).proc_id ==
 			    cpu_data(j).proc_id)
-				cpu_set(j, cpu_sibling_map[i]);
+				cpu_set(j, per_cpu(cpu_sibling_map, i));
 		}
 	}
 }
@@ -1340,9 +1339,9 @@
 		cpu_clear(cpu, cpu_core_map[i]);
 	cpus_clear(cpu_core_map[cpu]);
 
-	for_each_cpu_mask(i, cpu_sibling_map[cpu])
-		cpu_clear(cpu, cpu_sibling_map[i]);
-	cpus_clear(cpu_sibling_map[cpu]);
+	for_each_cpu_mask(i, per_cpu(cpu_sibling_map, cpu))
+		cpu_clear(cpu, per_cpu(cpu_sibling_map, i));
+	cpus_clear(per_cpu(cpu_sibling_map, cpu));
 
 	c = &cpu_data(cpu);
 
--- a/include/asm-sparc64/smp.h
+++ b/include/asm-sparc64/smp.h
@@ -28,8 +28,9 @@
  
 #include <asm/bitops.h>
 #include <asm/atomic.h>
+#include <asm/percpu.h>
 
-extern cpumask_t cpu_sibling_map[NR_CPUS];
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
 extern cpumask_t cpu_core_map[NR_CPUS];
 extern int sparc64_multi_core;
 
--- a/include/asm-sparc64/topology.h
+++ b/include/asm-sparc64/topology.h
@@ -5,7 +5,7 @@
 #define topology_physical_package_id(cpu)	(cpu_data(cpu).proc_id)
 #define topology_core_id(cpu)			(cpu_data(cpu).core_id)
 #define topology_core_siblings(cpu)		(cpu_core_map[cpu])
-#define topology_thread_siblings(cpu)		(cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu)		(per_cpu(cpu_sibling_map, cpu))
 #define mc_capable()				(sparc64_multi_core)
 #define smt_capable()				(sparc64_multi_core)
 #endif /* CONFIG_SMP */

-- 


* Re: [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (9 preceding siblings ...)
  2007-09-12  1:56 ` [PATCH 10/10] sparc64: " travis
@ 2007-09-13  9:53 ` Andi Kleen
  2007-09-14 23:32 ` Andrew Morton
  11 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2007-09-13  9:53 UTC (permalink / raw)
  To: travis
  Cc: linux-mm, linux-kernel, linuxppc-dev, sparclinux, Andrew Morton,
	Christoph Lameter

On Wednesday 12 September 2007 03:56, travis@sgi.com wrote:
> Note:
>
> This patch consolidates all the previous patches regarding
> the conversion of static arrays sized by NR_CPUS into per_cpu
> data arrays and is referenced against 2.6.23-rc6 .


Looks good to me from the x86 side. I'll leave it to Andrew to
handle for now though because it touches too many files
outside x86.

-Andi

* Re: [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3)
  2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
                   ` (10 preceding siblings ...)
  2007-09-13  9:53 ` [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) Andi Kleen
@ 2007-09-14 23:32 ` Andrew Morton
  11 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2007-09-14 23:32 UTC (permalink / raw)
  To: travis
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Christoph Lameter

On Tue, 11 Sep 2007 18:56:44 -0700
travis@sgi.com wrote:

> Changes for version v3:
> 
> cpu_sibling_map has been converted to a per_cpu data array to fix
> build errors on ia64, ppc64 and sparc64 to accommodate references in
> block/blktrace.c and kernel/sched.c when CONFIG_SCHED_SMT is defined.
> 
> Warning: ppc64 and sparc64 have not yet been built nor tested.

These patches all seem to be unaltered from what I had.

The first patch (x86: remove x86_cpu_to_log_apicid array) is in Andi's
tree as x86_64-mm-remove-x86_cpu_to_log_apicid.patch so I didn't apply that.

The sparc64/ppc64/ia64 convert-cpu_sibling_map-to-a-per_cpu-data-array
patches need to be folded into the base patch so that we don't break the build
at any stage.

So what I ended up with was

x86-fix-cpu_to_node-references.patch
x86-convert-cpu_core_map-to-be-a-per-cpu-variable.patch
convert-cpu_sibling_map-to-be-a-per-cpu-variable.patch
convert-cpu_sibling_map-to-a-per_cpu-data-array-ia64.patch
convert-cpu_sibling_map-to-a-per_cpu-data-array-ppc64.patch
convert-cpu_sibling_map-to-a-per_cpu-data-array-sparc64.patch
x86-convert-x86_cpu_to_apicid-to-be-a-per-cpu-variable.patch
x86-convert-cpu_llc_id-to-be-a-per-cpu-variable.patch

where the four convert-cpu_sibling_map-to-* will be clumped into a
single diff.

* Re: [PATCH 09/10] ppc64: Convert cpu_sibling_map to a per_cpu data array (v3)
  2007-09-12  1:56 ` [PATCH 09/10] ppc64: " travis
@ 2007-09-17  6:28   ` Stephen Rothwell
  2007-09-17  6:39     ` Stephen Rothwell
  2007-09-17 15:22     ` Mike Travis
  0 siblings, 2 replies; 18+ messages in thread
From: Stephen Rothwell @ 2007-09-17  6:28 UTC (permalink / raw)
  To: travis
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Andrew Morton, Christoph Lameter

On Tue, 11 Sep 2007 18:56:53 -0700 travis@sgi.com wrote:
>
> Convert cpu_sibling_map to a per_cpu cpumask_t array for the ppc64
> architecture.  This fixes build errors in block/blktrace.c and
> kernel/sched.c when CONFIG_SCHED_SMT is defined.
> 
> Note: these changes have not been built nor tested.

After applying all 10 patches, the ppc64_defconfig builds but:

	vmlinux is larger:

   text    data     bss     dec     hex filename
7705776 1756984  504624 9967384  981718 ppc64/vmlinux
7706228 1757120  504624 9967972  981964 trav.bld/vmlinux

	the topology (on my POWERPC5+ box) is not correct:

cpu0/topology/thread_siblings:0000000f
cpu1/topology/thread_siblings:0000000f
cpu2/topology/thread_siblings:0000000f
cpu3/topology/thread_siblings:0000000f

it used to be:

cpu0/topology/thread_siblings:00000003
cpu1/topology/thread_siblings:00000003
cpu2/topology/thread_siblings:0000000c
cpu3/topology/thread_siblings:0000000c

Similarly on my iSeries box, the topology is displayed as above
while it used to be:

cpu0/topology/thread_siblings:00000001
cpu1/topology/thread_siblings:00000002
cpu2/topology/thread_siblings:00000004
cpu3/topology/thread_siblings:00000008

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

* Re: [PATCH 09/10] ppc64: Convert cpu_sibling_map to a per_cpu data array (v3)
  2007-09-17  6:28   ` Stephen Rothwell
@ 2007-09-17  6:39     ` Stephen Rothwell
  2007-09-17 15:22     ` Mike Travis
  1 sibling, 0 replies; 18+ messages in thread
From: Stephen Rothwell @ 2007-09-17  6:39 UTC (permalink / raw)
  To: travis
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Andrew Morton, Christoph Lameter

On Mon, 17 Sep 2007 16:28:31 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> 	the topology (on my POWERPC5+ box) is not correct:
> 
> cpu0/topology/thread_siblings:0000000f
> cpu1/topology/thread_siblings:0000000f
> cpu2/topology/thread_siblings:0000000f
> cpu3/topology/thread_siblings:0000000f
> 
> it used to be:
> 
> cpu0/topology/thread_siblings:00000003
> cpu1/topology/thread_siblings:00000003
> cpu2/topology/thread_siblings:0000000c
> cpu3/topology/thread_siblings:0000000c

This would be because we are setting up cpu_sibling_map before we
call setup_per_cpu_areas().
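
A small userspace model may make the ordering problem easier to see.  It
is only a sketch, not kernel code: every model_* name is hypothetical,
and it assumes that before setup_per_cpu_areas() runs, per_cpu() for
every cpu resolves to the same template copy of the variable.  Under
that assumption each sibling bit lands in one shared mask, which is then
copied to every cpu, giving 0000000f everywhere.

```c
#include <assert.h>

#define NCPUS 4

/* Template copy of the variable (assumed shared by all cpus early in boot). */
static unsigned long sibling_map_template;
/* Per-cpu copies, only meaningful after model_setup_per_cpu_areas(). */
static unsigned long sibling_map_copy[NCPUS];
static int areas_ready;

/* Hypothetical stand-in for per_cpu(cpu_sibling_map, cpu). */
static unsigned long *model_per_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NCPUS);
	return areas_ready ? &sibling_map_copy[cpu] : &sibling_map_template;
}

/* Hypothetical stand-in for setup_per_cpu_areas(): copy the template out. */
static void model_setup_per_cpu_areas(void)
{
	for (int cpu = 0; cpu < NCPUS; cpu++)
		sibling_map_copy[cpu] = sibling_map_template;
	areas_ready = 1;
}

/* Two threads per core: cpu and cpu ^ 1 are siblings. */
static void model_set_siblings(void)
{
	for (int cpu = 0; cpu < NCPUS; cpu++) {
		*model_per_cpu(cpu) |= 1UL << cpu;
		*model_per_cpu(cpu) |= 1UL << (cpu ^ 1);
	}
}

/* Return the model to its boot-time state. */
static void model_reset(void)
{
	sibling_map_template = 0;
	for (int cpu = 0; cpu < NCPUS; cpu++)
		sibling_map_copy[cpu] = 0;
	areas_ready = 0;
}
```

Calling model_set_siblings() before model_setup_per_cpu_areas()
reproduces the all-cpus mask; calling them in the other order gives the
per-core pairs that the old array produced.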

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

* Re: [PATCH 09/10] ppc64: Convert cpu_sibling_map to a per_cpu data array (v3)
  2007-09-17  6:28   ` Stephen Rothwell
  2007-09-17  6:39     ` Stephen Rothwell
@ 2007-09-17 15:22     ` Mike Travis
  1 sibling, 0 replies; 18+ messages in thread
From: Mike Travis @ 2007-09-17 15:22 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
	Andrew Morton, Christoph Lameter

Stephen Rothwell wrote:
> On Tue, 11 Sep 2007 18:56:53 -0700 travis@sgi.com wrote:
>> Convert cpu_sibling_map to a per_cpu cpumask_t array for the ppc64
>> architecture.  This fixes build errors in block/blktrace.c and
>> kernel/sched.c when CONFIG_SCHED_SMT is defined.
>>
>> Note: these changes have not been built nor tested.
> 
> After applying all 10 patches, the ppc64_defconfig builds but:
> 
> 	vmlinux is larger:
> 
>    text    data     bss     dec     hex filename
> 7705776 1756984  504624 9967384  981718 ppc64/vmlinux
> 7706228 1757120  504624 9967972  981964 trav.bld/vmlinux
> 
> 	the topology (on my POWERPC5+ box) is not correct:
> 
> cpu0/topology/thread_siblings:0000000f
> cpu1/topology/thread_siblings:0000000f
> cpu2/topology/thread_siblings:0000000f
> cpu3/topology/thread_siblings:0000000f
> 
> it used to be:
> 
> cpu0/topology/thread_siblings:00000003
> cpu1/topology/thread_siblings:00000003
> cpu2/topology/thread_siblings:0000000c
> cpu3/topology/thread_siblings:0000000c
> 
> Similarly on my iSeries box, the topology is displayed as above
> while it used to be:
> 
> cpu0/topology/thread_siblings:00000001
> cpu1/topology/thread_siblings:00000002
> cpu2/topology/thread_siblings:00000004
> cpu3/topology/thread_siblings:00000008
> 

Thanks for the feedback, Stephen.  It may be the same situation
that some of the other arches encounter, in that the per_cpu
area is being accessed before it is set up.  I'll investigate
that a bit more.
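
For reference, the cover letter describes one way around this for
x86_cpu_to_apicid: keep a static boot-time array, then move the values
into the per_cpu area once it exists.  The sketch below models that
pattern in userspace C.  All model_* names are hypothetical, and the
thread does not say whether this is the fix that was chosen for ppc64.

```c
#include <assert.h>

#define NCPUS 4

/* Stands in for a static array placed in __initdata. */
static unsigned long model_early_sibling_map[NCPUS];
/* Stands in for the per_cpu copies. */
static unsigned long model_percpu_sibling_map[NCPUS];
static int model_percpu_ready;

/* Accessor that is safe both before and after the per-cpu areas exist. */
static unsigned long *model_sibling_map(int cpu)
{
	assert(cpu >= 0 && cpu < NCPUS);
	return model_percpu_ready ? &model_percpu_sibling_map[cpu]
				  : &model_early_sibling_map[cpu];
}

/* Run once, after the per-cpu areas have been set up. */
static void model_migrate_to_per_cpu(void)
{
	for (int cpu = 0; cpu < NCPUS; cpu++)
		model_percpu_sibling_map[cpu] = model_early_sibling_map[cpu];
	model_percpu_ready = 1;
}
```

Values written through model_sibling_map() early in boot stay distinct
per cpu and survive the migration unchanged.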

Mike

* Re: [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3)
  2007-09-12  1:56 ` [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3) travis
@ 2007-09-28  9:49   ` Paul Jackson
  2007-10-03 19:22     ` Mike Travis
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Jackson @ 2007-09-28  9:49 UTC (permalink / raw)
  To: travis; +Cc: linux-mm, ak, linux-kernel, linuxppc-dev, sparclinux, akpm,
	clameter

Mike,

I think there is a bug either in this ia64 patch, or in the related
generic arch patch: Convert cpu_sibling_map to be a per cpu variable
(v3).

It dies early in boot on me, on the SGI internal 8 processor IA64
system that you and I know as 'margin'.  The death is a hard hang, due
to a corrupt stack, due to a bogus cpu index.

I haven't tracked it down all the way, but have gotten this far.  If I add
the following patch, I get a panic on the BUG_ON if I have these two patches
in 2.6.23-rc8-mm1, but it boots just fine if I don't have these two patches.

It seems that the "cpu_sibling_map[cpu]" cpumask_t is empty (all zero
bits) with your two patches applied, but has some non-zero bits
otherwise, which leads to 'group' being NR_CPUS instead of a useful CPU
number.  Unfortunately, I have no idea why the "cpu_sibling_map[cpu]"
cpumask_t is empty -- good luck on that part.
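
The path from an empty mask to group == NR_CPUS can be sketched in
userspace C.  This is a model with hypothetical model_* names and an
unsigned long in place of cpumask_t.  It assumes that the SMT branch of
cpu_to_phys_group() takes first_cpu() of the sibling mask ANDed with
cpu_map, and that first_cpu() of an empty mask yields NR_CPUS.

```c
#include <assert.h>

#define MODEL_NR_CPUS 8

/* Lowest set bit, or MODEL_NR_CPUS when the mask is empty. */
static int model_first_cpu(unsigned long mask)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (1UL << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

/* Assumed shape of the SMT branch: first cpu in (siblings & cpu_map). */
static int model_phys_group(unsigned long siblings, unsigned long cpu_map)
{
	return model_first_cpu(siblings & cpu_map);
}

/* A group is usable only if it indexes a MODEL_NR_CPUS-sized table. */
static int model_group_is_valid(int group)
{
	return group >= 0 && group < MODEL_NR_CPUS;
}

/* Guarded lookup, in the spirit of the BUG_ON() check. */
static int model_lookup(const int *table, int group)
{
	assert(model_group_is_valid(group));
	return table[group];
}
```

With an all-zero sibling mask, model_phys_group() returns MODEL_NR_CPUS,
one past the last valid index, so an unguarded lookup would read outside
the table.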

The patch that catches this bug earlier is this:

--- 2.6.23-rc8-mm1.orig/kernel/sched.c	2007-09-28 01:42:20.144561024 -0700
+++ 2.6.23-rc8-mm1/kernel/sched.c	2007-09-28 02:27:14.239075497 -0700
@@ -5905,6 +5905,7 @@ static int cpu_to_phys_group(int cpu, co
 #else
 	group = cpu;
 #endif
+	BUG_ON(group == NR_CPUS);
 	if (sg)
 		*sg = &per_cpu(sched_group_phys, group);
 	return group;


-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

* Re: [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3)
  2007-09-28  9:49   ` Paul Jackson
@ 2007-10-03 19:22     ` Mike Travis
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Travis @ 2007-10-03 19:22 UTC (permalink / raw)
  To: Paul Jackson
  Cc: linux-mm, ak, linux-kernel, linuxppc-dev, sparclinux, akpm,
	clameter

Hi Paul,

I just now found this.  I'll take a look immediately.  I tried it
on a couple of systems but not margin. 

Thanks,
Mike

Paul Jackson wrote:
> Mike,
> 
> I think there is a bug either in this ia64 patch, or in the related
> generic arch patch: Convert cpu_sibling_map to be a per cpu variable
> (v3).
> 
> It dies early in boot on me, on the SGI internal 8 processor IA64
> system that you and I know as 'margin'.  The death is a hard hang, due
> to a corrupt stack, due to a bogus cpu index.
> 
> I haven't tracked it down all the way, but have gotten this far.  If I add
> the following patch, I get a panic on the BUG_ON if I have these two patches
> in 2.6.23-rc8-mm1, but it boots just fine if I don't have these two patches.
> 
> It seems that the "cpu_sibling_map[cpu]" cpumask_t is empty (all zero
> bits) with your two patches applied, but has some non-zero bits
> otherwise, which leads to 'group' being NR_CPUS instead of a useful CPU
> number.  Unfortunately, I have no idea why the "cpu_sibling_map[cpu]"
> cpumask_t is empty -- good luck on that part.
> 
> The patch that catches this bug earlier is this:
> 
> --- 2.6.23-rc8-mm1.orig/kernel/sched.c	2007-09-28 01:42:20.144561024 -0700
> +++ 2.6.23-rc8-mm1/kernel/sched.c	2007-09-28 02:27:14.239075497 -0700
> @@ -5905,6 +5905,7 @@ static int cpu_to_phys_group(int cpu, co
>  #else
>  	group = cpu;
>  #endif
> +	BUG_ON(group == NR_CPUS);
>  	if (sg)
>  		*sg = &per_cpu(sched_group_phys, group);
>  	return group;
> 
> 

Thread overview: 18+ messages
2007-09-12  1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
2007-09-12  1:56 ` [PATCH 01/10] x86: remove x86_cpu_to_log_apicid array (v3) travis
2007-09-12  1:56 ` [PATCH 02/10] x86: fix cpu_to_node references (v3) travis
2007-09-12  1:56 ` [PATCH 03/10] x86: Convert cpu_core_map to be a per cpu variable (v3) travis
2007-09-12  1:56 ` [PATCH 04/10] x86: Convert cpu_sibling_map " travis
2007-09-12  1:56 ` [PATCH 05/10] x86: Convert x86_cpu_to_apicid " travis
2007-09-12  1:56 ` [PATCH 06/10] x86: Convert cpu_llc_id " travis
2007-09-12  1:56 ` [PATCH 07/10] x86: acpi-use-cpu_physical_id (v3) travis
2007-09-12  1:56 ` [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3) travis
2007-09-28  9:49   ` Paul Jackson
2007-10-03 19:22     ` Mike Travis
2007-09-12  1:56 ` [PATCH 09/10] ppc64: " travis
2007-09-17  6:28   ` Stephen Rothwell
2007-09-17  6:39     ` Stephen Rothwell
2007-09-17 15:22     ` Mike Travis
2007-09-12  1:56 ` [PATCH 10/10] sparc64: " travis
2007-09-13  9:53 ` [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) Andi Kleen
2007-09-14 23:32 ` Andrew Morton
