* [PATCH 01/10] x86: remove x86_cpu_to_log_apicid array (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
@ 2007-09-12 1:56 ` travis
2007-09-12 1:56 ` [PATCH 02/10] x86: fix cpu_to_node references (v3) travis
` (10 subsequent siblings)
11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
This is a copy of an older patch that is in rc3-mm1. It's needed
to allow the remaining patches to integrate correctly.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86_64/kernel/genapic.c | 2 --
arch/x86_64/kernel/genapic_flat.c | 1 -
arch/x86_64/kernel/smpboot.c | 1 -
include/asm-x86_64/smp.h | 1 -
4 files changed, 5 deletions(-)
--- a/arch/x86_64/kernel/genapic.c
+++ b/arch/x86_64/kernel/genapic.c
@@ -29,8 +29,6 @@
= { [0 ... NR_CPUS-1] = BAD_APICID };
EXPORT_SYMBOL(x86_cpu_to_apicid);
-u8 x86_cpu_to_log_apicid[NR_CPUS] = { [0 ... NR_CPUS-1] = BAD_APICID };
-
struct genapic __read_mostly *genapic = &apic_flat;
/*
--- a/arch/x86_64/kernel/genapic_flat.c
+++ b/arch/x86_64/kernel/genapic_flat.c
@@ -52,7 +52,6 @@
num = smp_processor_id();
id = 1UL << num;
- x86_cpu_to_log_apicid[num] = id;
apic_write(APIC_DFR, APIC_DFR_FLAT);
val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
val |= SET_APIC_LOGICAL_ID(id);
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -702,7 +702,6 @@
cpu_clear(cpu, cpu_present_map);
cpu_clear(cpu, cpu_possible_map);
x86_cpu_to_apicid[cpu] = BAD_APICID;
- x86_cpu_to_log_apicid[cpu] = BAD_APICID;
return -EIO;
}
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -78,7 +78,6 @@
* the real APIC ID <-> CPU # mapping.
*/
extern u8 x86_cpu_to_apicid[NR_CPUS]; /* physical ID */
-extern u8 x86_cpu_to_log_apicid[NR_CPUS];
extern u8 bios_cpu_apicid[];
static inline int cpu_present_to_apicid(int mps_cpu)
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 02/10] x86: fix cpu_to_node references (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
2007-09-12 1:56 ` [PATCH 01/10] x86: remove x86_cpu_to_log_apicid array (v3) travis
@ 2007-09-12 1:56 ` travis
2007-09-12 1:56 ` [PATCH 03/10] x86: Convert cpu_core_map to be a per cpu variable (v3) travis
` (9 subsequent siblings)
11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
Fix four instances where cpu_to_node is referenced
by array instead of via the cpu_to_node macro. This
is preparation to moving it to the per_cpu data area.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86_64/kernel/vsyscall.c | 2 +-
arch/x86_64/mm/numa.c | 4 ++--
arch/x86_64/mm/srat.c | 4 ++--
3 files changed, 5 insertions(+), 5 deletions(-)
--- a/arch/x86_64/kernel/vsyscall.c
+++ b/arch/x86_64/kernel/vsyscall.c
@@ -291,7 +291,7 @@
unsigned long *d;
unsigned long node = 0;
#ifdef CONFIG_NUMA
- node = cpu_to_node[cpu];
+ node = cpu_to_node(cpu);
#endif
if (cpu_has(&cpu_data[cpu], X86_FEATURE_RDTSCP))
write_rdtscp_aux((node << 12) | cpu);
--- a/arch/x86_64/mm/numa.c
+++ b/arch/x86_64/mm/numa.c
@@ -261,7 +261,7 @@
We round robin the existing nodes. */
rr = first_node(node_online_map);
for (i = 0; i < NR_CPUS; i++) {
- if (cpu_to_node[i] != NUMA_NO_NODE)
+ if (cpu_to_node(i) != NUMA_NO_NODE)
continue;
numa_set_node(i, rr);
rr = next_node(rr, node_online_map);
@@ -543,7 +543,7 @@
void __cpuinit numa_set_node(int cpu, int node)
{
cpu_pda(cpu)->nodenumber = node;
- cpu_to_node[cpu] = node;
+ cpu_to_node(cpu) = node;
}
unsigned long __init numa_free_all_bootmem(void)
--- a/arch/x86_64/mm/srat.c
+++ b/arch/x86_64/mm/srat.c
@@ -431,9 +431,9 @@
setup_node_bootmem(i, nodes[i].start, nodes[i].end);
for (i = 0; i < NR_CPUS; i++) {
- if (cpu_to_node[i] == NUMA_NO_NODE)
+ if (cpu_to_node(i) == NUMA_NO_NODE)
continue;
- if (!node_isset(cpu_to_node[i], node_possible_map))
+ if (!node_isset(cpu_to_node(i), node_possible_map))
numa_set_node(i, NUMA_NO_NODE);
}
numa_init_array();
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 03/10] x86: Convert cpu_core_map to be a per cpu variable (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
2007-09-12 1:56 ` [PATCH 01/10] x86: remove x86_cpu_to_log_apicid array (v3) travis
2007-09-12 1:56 ` [PATCH 02/10] x86: fix cpu_to_node references (v3) travis
@ 2007-09-12 1:56 ` travis
2007-09-12 1:56 ` [PATCH 04/10] x86: Convert cpu_sibling_map " travis
` (8 subsequent siblings)
11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
This is from an earlier message from 'Christoph Lameter':
cpu_core_map is currently an array defined using NR_CPUS. This means that
we overallocate since we will rarely really use maximum configured cpu.
If we put the cpu_core_map into the per cpu area then it will be allocated
for each processor as it comes online.
This means that the core map cannot be accessed until the per cpu area
has been allocated. Xen does a weird thing here looping over all processors
and zeroing the masks that are not yet allocated and that will be zeroed
when they are allocated. I commented the code out.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c | 2 -
arch/i386/kernel/cpu/cpufreq/powernow-k8.c | 10 ++++----
arch/i386/kernel/cpu/proc.c | 3 +-
arch/i386/kernel/smpboot.c | 34 ++++++++++++++--------------
arch/i386/xen/smp.c | 14 +++++++++--
arch/x86_64/kernel/mce_amd.c | 6 ++--
arch/x86_64/kernel/setup.c | 3 +-
arch/x86_64/kernel/smpboot.c | 24 +++++++++----------
include/asm-i386/smp.h | 2 -
include/asm-i386/topology.h | 2 -
include/asm-x86_64/smp.h | 8 +++++-
include/asm-x86_64/topology.h | 2 -
12 files changed, 64 insertions(+), 46 deletions(-)
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -39,7 +39,13 @@
extern void smp_send_reschedule(int cpu);
extern cpumask_t cpu_sibling_map[NR_CPUS];
-extern cpumask_t cpu_core_map[NR_CPUS];
+/*
+ * cpu_core_map lives in a per cpu area
+ *
+ * extern cpumask_t cpu_core_map[NR_CPUS];
+ */
+DECLARE_PER_CPU(cpumask_t, cpu_core_map);
+
extern u8 cpu_llc_id[NR_CPUS];
#define SMP_TRAMPOLINE_BASE 0x6000
--- a/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ b/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -595,7 +595,7 @@
dmi_check_system(sw_any_bug_dmi_table);
if (bios_with_sw_any_bug && cpus_weight(policy->cpus) == 1) {
policy->shared_type = CPUFREQ_SHARED_TYPE_ALL;
- policy->cpus = cpu_core_map[cpu];
+ policy->cpus = per_cpu(cpu_core_map, cpu);
}
#endif
--- a/arch/i386/kernel/cpu/cpufreq/powernow-k8.c
+++ b/arch/i386/kernel/cpu/cpufreq/powernow-k8.c
@@ -57,7 +57,7 @@
static int cpu_family = CPU_OPTERON;
#ifndef CONFIG_SMP
-static cpumask_t cpu_core_map[1];
+DEFINE_PER_CPU(cpumask_t, cpu_core_map);
#endif
/* Return a frequency in MHz, given an input fid */
@@ -664,7 +664,7 @@
dprintk("cfid 0x%x, cvid 0x%x\n", data->currfid, data->currvid);
data->powernow_table = powernow_table;
- if (first_cpu(cpu_core_map[data->cpu]) == data->cpu)
+ if (first_cpu(per_cpu(cpu_core_map, data->cpu)) == data->cpu)
print_basics(data);
for (j = 0; j < data->numps; j++)
@@ -818,7 +818,7 @@
/* fill in data */
data->numps = data->acpi_data.state_count;
- if (first_cpu(cpu_core_map[data->cpu]) == data->cpu)
+ if (first_cpu(per_cpu(cpu_core_map, data->cpu)) == data->cpu)
print_basics(data);
powernow_k8_acpi_pst_values(data, 0);
@@ -1212,7 +1212,7 @@
if (cpu_family == CPU_HW_PSTATE)
pol->cpus = cpumask_of_cpu(pol->cpu);
else
- pol->cpus = cpu_core_map[pol->cpu];
+ pol->cpus = per_cpu(cpu_core_map, pol->cpu);
data->available_cores = &(pol->cpus);
/* Take a crude guess here.
@@ -1279,7 +1279,7 @@
cpumask_t oldmask = current->cpus_allowed;
unsigned int khz = 0;
- data = powernow_data[first_cpu(cpu_core_map[cpu])];
+ data = powernow_data[first_cpu(per_cpu(cpu_core_map, cpu))];
if (!data)
return -EINVAL;
--- a/arch/i386/kernel/cpu/proc.c
+++ b/arch/i386/kernel/cpu/proc.c
@@ -122,7 +122,8 @@
#ifdef CONFIG_X86_HT
if (c->x86_max_cores * smp_num_siblings > 1) {
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
- seq_printf(m, "siblings\t: %d\n", cpus_weight(cpu_core_map[n]));
+ seq_printf(m, "siblings\t: %d\n",
+ cpus_weight(per_cpu(cpu_core_map, n)));
seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
}
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -74,8 +74,8 @@
EXPORT_SYMBOL(cpu_sibling_map);
/* representing HT and core siblings of each logical CPU */
-cpumask_t cpu_core_map[NR_CPUS] __read_mostly;
-EXPORT_SYMBOL(cpu_core_map);
+DEFINE_PER_CPU(cpumask_t, cpu_core_map);
+EXPORT_PER_CPU_SYMBOL(cpu_core_map);
/* bitmap of online cpus */
cpumask_t cpu_online_map __read_mostly;
@@ -300,7 +300,7 @@
* And for power savings, we return cpu_core_map
*/
if (sched_mc_power_savings || sched_smt_power_savings)
- return cpu_core_map[cpu];
+ return per_cpu(cpu_core_map, cpu);
else
return c->llc_shared_map;
}
@@ -321,8 +321,8 @@
c[cpu].cpu_core_id == c[i].cpu_core_id) {
cpu_set(i, cpu_sibling_map[cpu]);
cpu_set(cpu, cpu_sibling_map[i]);
- cpu_set(i, cpu_core_map[cpu]);
- cpu_set(cpu, cpu_core_map[i]);
+ cpu_set(i, per_cpu(cpu_core_map, cpu));
+ cpu_set(cpu, per_cpu(cpu_core_map, i));
cpu_set(i, c[cpu].llc_shared_map);
cpu_set(cpu, c[i].llc_shared_map);
}
@@ -334,7 +334,7 @@
cpu_set(cpu, c[cpu].llc_shared_map);
if (current_cpu_data.x86_max_cores == 1) {
- cpu_core_map[cpu] = cpu_sibling_map[cpu];
+ per_cpu(cpu_core_map, cpu) = cpu_sibling_map[cpu];
c[cpu].booted_cores = 1;
return;
}
@@ -346,8 +346,8 @@
cpu_set(cpu, c[i].llc_shared_map);
}
if (c[cpu].phys_proc_id == c[i].phys_proc_id) {
- cpu_set(i, cpu_core_map[cpu]);
- cpu_set(cpu, cpu_core_map[i]);
+ cpu_set(i, per_cpu(cpu_core_map, cpu));
+ cpu_set(cpu, per_cpu(cpu_core_map, i));
/*
* Does this new cpu bringup a new core?
*/
@@ -984,7 +984,7 @@
" Using dummy APIC emulation.\n");
map_cpu_to_logical_apicid();
cpu_set(0, cpu_sibling_map[0]);
- cpu_set(0, cpu_core_map[0]);
+ cpu_set(0, per_cpu(cpu_core_map, 0));
return;
}
@@ -1009,7 +1009,7 @@
smpboot_clear_io_apic_irqs();
phys_cpu_present_map = physid_mask_of_physid(0);
cpu_set(0, cpu_sibling_map[0]);
- cpu_set(0, cpu_core_map[0]);
+ cpu_set(0, per_cpu(cpu_core_map, 0));
return;
}
@@ -1024,7 +1024,7 @@
smpboot_clear_io_apic_irqs();
phys_cpu_present_map = physid_mask_of_physid(0);
cpu_set(0, cpu_sibling_map[0]);
- cpu_set(0, cpu_core_map[0]);
+ cpu_set(0, per_cpu(cpu_core_map, 0));
return;
}
@@ -1107,11 +1107,11 @@
*/
for (cpu = 0; cpu < NR_CPUS; cpu++) {
cpus_clear(cpu_sibling_map[cpu]);
- cpus_clear(cpu_core_map[cpu]);
+ cpus_clear(per_cpu(cpu_core_map, cpu));
}
cpu_set(0, cpu_sibling_map[0]);
- cpu_set(0, cpu_core_map[0]);
+ cpu_set(0, per_cpu(cpu_core_map, 0));
smpboot_setup_io_apic();
@@ -1148,9 +1148,9 @@
int sibling;
struct cpuinfo_x86 *c = cpu_data;
- for_each_cpu_mask(sibling, cpu_core_map[cpu]) {
- cpu_clear(cpu, cpu_core_map[sibling]);
- /*
+ for_each_cpu_mask(sibling, per_cpu(cpu_core_map, cpu)) {
+ cpu_clear(cpu, per_cpu(cpu_core_map, sibling));
+ /*/
* last thread sibling in this cpu core going down
*/
if (cpus_weight(cpu_sibling_map[cpu]) == 1)
@@ -1160,7 +1160,7 @@
for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
cpu_clear(cpu, cpu_sibling_map[sibling]);
cpus_clear(cpu_sibling_map[cpu]);
- cpus_clear(cpu_core_map[cpu]);
+ cpus_clear(per_cpu(cpu_core_map, cpu));
c[cpu].phys_proc_id = 0;
c[cpu].cpu_core_id = 0;
cpu_clear(cpu, cpu_sibling_setup_map);
--- a/arch/i386/xen/smp.c
+++ b/arch/i386/xen/smp.c
@@ -148,7 +148,12 @@
for (cpu = 0; cpu < NR_CPUS; cpu++) {
cpus_clear(cpu_sibling_map[cpu]);
- cpus_clear(cpu_core_map[cpu]);
+ /*
+ * cpu_core_map lives in a per cpu area that is cleared
+ * when the per cpu array is allocated.
+ *
+ * cpus_clear(per_cpu(cpu_core_map, cpu));
+ */
}
xen_setup_vcpu_info_placement();
@@ -160,7 +165,12 @@
for (cpu = 0; cpu < NR_CPUS; cpu++) {
cpus_clear(cpu_sibling_map[cpu]);
- cpus_clear(cpu_core_map[cpu]);
+ /*
+ * cpu_core_ map will be zeroed when the per
+ * cpu area is allocated.
+ *
+ * cpus_clear(per_cpu(cpu_core_map, cpu));
+ */
}
smp_store_cpu_info(0);
--- a/arch/x86_64/kernel/mce_amd.c
+++ b/arch/x86_64/kernel/mce_amd.c
@@ -472,7 +472,7 @@
#ifdef CONFIG_SMP
if (cpu_data[cpu].cpu_core_id && shared_bank[bank]) { /* symlink */
- i = first_cpu(cpu_core_map[cpu]);
+ i = first_cpu(per_cpu(cpu_core_map, cpu));
/* first core not up yet */
if (cpu_data[i].cpu_core_id)
@@ -492,7 +492,7 @@
if (err)
goto out;
- b->cpus = cpu_core_map[cpu];
+ b->cpus = per_cpu(cpu_core_map, cpu);
per_cpu(threshold_banks, cpu)[bank] = b;
goto out;
}
@@ -509,7 +509,7 @@
#ifndef CONFIG_SMP
b->cpus = CPU_MASK_ALL;
#else
- b->cpus = cpu_core_map[cpu];
+ b->cpus = per_cpu(cpu_core_map, cpu);
#endif
err = kobject_register(&b->kobj);
if (err)
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -1041,7 +1041,8 @@
if (smp_num_siblings * c->x86_max_cores > 1) {
int cpu = c - cpu_data;
seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
- seq_printf(m, "siblings\t: %d\n", cpus_weight(cpu_core_map[cpu]));
+ seq_printf(m, "siblings\t: %d\n",
+ cpus_weight(per_cpu(cpu_core_map, cpu)));
seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
}
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -95,8 +95,8 @@
EXPORT_SYMBOL(cpu_sibling_map);
/* representing HT and core siblings of each logical CPU */
-cpumask_t cpu_core_map[NR_CPUS] __read_mostly;
-EXPORT_SYMBOL(cpu_core_map);
+DEFINE_PER_CPU(cpumask_t, cpu_core_map);
+EXPORT_PER_CPU_SYMBOL(cpu_core_map);
/*
* Trampoline 80x86 program as an array.
@@ -245,7 +245,7 @@
* And for power savings, we return cpu_core_map
*/
if (sched_mc_power_savings || sched_smt_power_savings)
- return cpu_core_map[cpu];
+ return per_cpu(cpu_core_map, cpu);
else
return c->llc_shared_map;
}
@@ -266,8 +266,8 @@
c[cpu].cpu_core_id == c[i].cpu_core_id) {
cpu_set(i, cpu_sibling_map[cpu]);
cpu_set(cpu, cpu_sibling_map[i]);
- cpu_set(i, cpu_core_map[cpu]);
- cpu_set(cpu, cpu_core_map[i]);
+ cpu_set(i, per_cpu(cpu_core_map, cpu));
+ cpu_set(cpu, per_cpu(cpu_core_map, i));
cpu_set(i, c[cpu].llc_shared_map);
cpu_set(cpu, c[i].llc_shared_map);
}
@@ -279,7 +279,7 @@
cpu_set(cpu, c[cpu].llc_shared_map);
if (current_cpu_data.x86_max_cores == 1) {
- cpu_core_map[cpu] = cpu_sibling_map[cpu];
+ per_cpu(cpu_core_map, cpu) = cpu_sibling_map[cpu];
c[cpu].booted_cores = 1;
return;
}
@@ -291,8 +291,8 @@
cpu_set(cpu, c[i].llc_shared_map);
}
if (c[cpu].phys_proc_id == c[i].phys_proc_id) {
- cpu_set(i, cpu_core_map[cpu]);
- cpu_set(cpu, cpu_core_map[i]);
+ cpu_set(i, per_cpu(cpu_core_map, cpu));
+ cpu_set(cpu, per_cpu(cpu_core_map, i));
/*
* Does this new cpu bringup a new core?
*/
@@ -742,7 +742,7 @@
else
phys_cpu_present_map = physid_mask_of_physid(0);
cpu_set(0, cpu_sibling_map[0]);
- cpu_set(0, cpu_core_map[0]);
+ cpu_set(0, per_cpu(cpu_core_map, 0));
}
#ifdef CONFIG_HOTPLUG_CPU
@@ -977,8 +977,8 @@
int sibling;
struct cpuinfo_x86 *c = cpu_data;
- for_each_cpu_mask(sibling, cpu_core_map[cpu]) {
- cpu_clear(cpu, cpu_core_map[sibling]);
+ for_each_cpu_mask(sibling, per_cpu(cpu_core_map, cpu)) {
+ cpu_clear(cpu, per_cpu(cpu_core_map, sibling));
/*
* last thread sibling in this cpu core going down
*/
@@ -989,7 +989,7 @@
for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
cpu_clear(cpu, cpu_sibling_map[sibling]);
cpus_clear(cpu_sibling_map[cpu]);
- cpus_clear(cpu_core_map[cpu]);
+ cpus_clear(per_cpu(cpu_core_map, cpu));
c[cpu].phys_proc_id = 0;
c[cpu].cpu_core_id = 0;
cpu_clear(cpu, cpu_sibling_setup_map);
--- a/include/asm-i386/smp.h
+++ b/include/asm-i386/smp.h
@@ -31,7 +31,7 @@
extern int pic_mode;
extern int smp_num_siblings;
extern cpumask_t cpu_sibling_map[];
-extern cpumask_t cpu_core_map[];
+DECLARE_PER_CPU(cpumask_t, cpu_core_map);
extern void (*mtrr_hook) (void);
extern void zap_low_mappings (void);
--- a/include/asm-i386/topology.h
+++ b/include/asm-i386/topology.h
@@ -30,7 +30,7 @@
#ifdef CONFIG_X86_HT
#define topology_physical_package_id(cpu) (cpu_data[cpu].phys_proc_id)
#define topology_core_id(cpu) (cpu_data[cpu].cpu_core_id)
-#define topology_core_siblings(cpu) (cpu_core_map[cpu])
+#define topology_core_siblings(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_thread_siblings(cpu) (cpu_sibling_map[cpu])
#endif
--- a/include/asm-x86_64/topology.h
+++ b/include/asm-x86_64/topology.h
@@ -58,7 +58,7 @@
#ifdef CONFIG_SMP
#define topology_physical_package_id(cpu) (cpu_data[cpu].phys_proc_id)
#define topology_core_id(cpu) (cpu_data[cpu].cpu_core_id)
-#define topology_core_siblings(cpu) (cpu_core_map[cpu])
+#define topology_core_siblings(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_thread_siblings(cpu) (cpu_sibling_map[cpu])
#define mc_capable() (boot_cpu_data.x86_max_cores > 1)
#define smt_capable() (smp_num_siblings > 1)
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 04/10] x86: Convert cpu_sibling_map to be a per cpu variable (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (2 preceding siblings ...)
2007-09-12 1:56 ` [PATCH 03/10] x86: Convert cpu_core_map to be a per cpu variable (v3) travis
@ 2007-09-12 1:56 ` travis
2007-09-12 1:56 ` [PATCH 05/10] x86: Convert x86_cpu_to_apicid " travis
` (7 subsequent siblings)
11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
Convert cpu_sibling_map from a static array sized by NR_CPUS to a
per_cpu variable. This saves sizeof(cpumask_t) * NR unused cpus.
Access is mostly from startup and CPU HOTPLUG functions.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/i386/kernel/cpu/cpufreq/p4-clockmod.c | 2 -
arch/i386/kernel/cpu/cpufreq/speedstep-ich.c | 2 -
arch/i386/kernel/io_apic.c | 4 +--
arch/i386/kernel/smpboot.c | 36 +++++++++++++--------------
arch/i386/oprofile/op_model_p4.c | 2 -
arch/i386/xen/smp.c | 4 +--
arch/x86_64/kernel/smpboot.c | 26 +++++++++----------
block/blktrace.c | 2 -
include/asm-i386/smp.h | 2 -
include/asm-i386/topology.h | 2 -
include/asm-x86_64/smp.h | 6 +++-
include/asm-x86_64/topology.h | 2 -
kernel/sched.c | 8 +++---
13 files changed, 50 insertions(+), 48 deletions(-)
--- a/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
+++ b/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
@@ -200,7 +200,7 @@
unsigned int i;
#ifdef CONFIG_SMP
- policy->cpus = cpu_sibling_map[policy->cpu];
+ policy->cpus = per_cpu(cpu_sibling_map, policy->cpu);
#endif
/* Errata workaround */
--- a/arch/i386/kernel/cpu/cpufreq/speedstep-ich.c
+++ b/arch/i386/kernel/cpu/cpufreq/speedstep-ich.c
@@ -322,7 +322,7 @@
/* only run on CPU to be set, or on its sibling */
#ifdef CONFIG_SMP
- policy->cpus = cpu_sibling_map[policy->cpu];
+ policy->cpus = per_cpu(cpu_sibling_map, policy->cpu);
#endif
cpus_allowed = current->cpus_allowed;
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -378,7 +378,7 @@
#define IRQ_ALLOWED(cpu, allowed_mask) cpu_isset(cpu, allowed_mask)
-#define CPU_TO_PACKAGEINDEX(i) (first_cpu(cpu_sibling_map[i]))
+#define CPU_TO_PACKAGEINDEX(i) (first_cpu(per_cpu(cpu_sibling_map, i)))
static cpumask_t balance_irq_affinity[NR_IRQS] = {
[0 ... NR_IRQS-1] = CPU_MASK_ALL
@@ -598,7 +598,7 @@
* (A+B)/2 vs B
*/
load = CPU_IRQ(min_loaded) >> 1;
- for_each_cpu_mask(j, cpu_sibling_map[min_loaded]) {
+ for_each_cpu_mask(j, per_cpu(cpu_sibling_map, min_loaded)) {
if (load > CPU_IRQ(j)) {
/* This won't change cpu_sibling_map[min_loaded] */
load = CPU_IRQ(j);
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -70,8 +70,8 @@
int cpu_llc_id[NR_CPUS] __cpuinitdata = {[0 ... NR_CPUS-1] = BAD_APICID};
/* representing HT siblings of each logical CPU */
-cpumask_t cpu_sibling_map[NR_CPUS] __read_mostly;
-EXPORT_SYMBOL(cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_t, cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
/* representing HT and core siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_t, cpu_core_map);
@@ -319,8 +319,8 @@
for_each_cpu_mask(i, cpu_sibling_setup_map) {
if (c[cpu].phys_proc_id == c[i].phys_proc_id &&
c[cpu].cpu_core_id == c[i].cpu_core_id) {
- cpu_set(i, cpu_sibling_map[cpu]);
- cpu_set(cpu, cpu_sibling_map[i]);
+ cpu_set(i, per_cpu(cpu_sibling_map, cpu));
+ cpu_set(cpu, per_cpu(cpu_sibling_map, i));
cpu_set(i, per_cpu(cpu_core_map, cpu));
cpu_set(cpu, per_cpu(cpu_core_map, i));
cpu_set(i, c[cpu].llc_shared_map);
@@ -328,13 +328,13 @@
}
}
} else {
- cpu_set(cpu, cpu_sibling_map[cpu]);
+ cpu_set(cpu, per_cpu(cpu_sibling_map, cpu));
}
cpu_set(cpu, c[cpu].llc_shared_map);
if (current_cpu_data.x86_max_cores == 1) {
- per_cpu(cpu_core_map, cpu) = cpu_sibling_map[cpu];
+ per_cpu(cpu_core_map, cpu) = per_cpu(cpu_sibling_map, cpu);
c[cpu].booted_cores = 1;
return;
}
@@ -351,12 +351,12 @@
/*
* Does this new cpu bringup a new core?
*/
- if (cpus_weight(cpu_sibling_map[cpu]) == 1) {
+ if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1) {
/*
* for each core in package, increment
* the booted_cores for this new cpu
*/
- if (first_cpu(cpu_sibling_map[i]) == i)
+ if (first_cpu(per_cpu(cpu_sibling_map, i)) == i)
c[cpu].booted_cores++;
/*
* increment the core count for all
@@ -983,7 +983,7 @@
printk(KERN_NOTICE "Local APIC not detected."
" Using dummy APIC emulation.\n");
map_cpu_to_logical_apicid();
- cpu_set(0, cpu_sibling_map[0]);
+ cpu_set(0, per_cpu(cpu_sibling_map, 0));
cpu_set(0, per_cpu(cpu_core_map, 0));
return;
}
@@ -1008,7 +1008,7 @@
printk(KERN_ERR "... forcing use of dummy APIC emulation. (tell your hw vendor)\n");
smpboot_clear_io_apic_irqs();
phys_cpu_present_map = physid_mask_of_physid(0);
- cpu_set(0, cpu_sibling_map[0]);
+ cpu_set(0, per_cpu(cpu_sibling_map, 0));
cpu_set(0, per_cpu(cpu_core_map, 0));
return;
}
@@ -1023,7 +1023,7 @@
printk(KERN_INFO "SMP mode deactivated, forcing use of dummy APIC emulation.\n");
smpboot_clear_io_apic_irqs();
phys_cpu_present_map = physid_mask_of_physid(0);
- cpu_set(0, cpu_sibling_map[0]);
+ cpu_set(0, per_cpu(cpu_sibling_map, 0));
cpu_set(0, per_cpu(cpu_core_map, 0));
return;
}
@@ -1102,15 +1102,15 @@
Dprintk("Boot done.\n");
/*
- * construct cpu_sibling_map[], so that we can tell sibling CPUs
+ * construct cpu_sibling_map, so that we can tell sibling CPUs
* efficiently.
*/
for (cpu = 0; cpu < NR_CPUS; cpu++) {
- cpus_clear(cpu_sibling_map[cpu]);
+ cpus_clear(per_cpu(cpu_sibling_map, cpu));
cpus_clear(per_cpu(cpu_core_map, cpu));
}
- cpu_set(0, cpu_sibling_map[0]);
+ cpu_set(0, per_cpu(cpu_sibling_map, 0));
cpu_set(0, per_cpu(cpu_core_map, 0));
smpboot_setup_io_apic();
@@ -1153,13 +1153,13 @@
/*/
* last thread sibling in this cpu core going down
*/
- if (cpus_weight(cpu_sibling_map[cpu]) == 1)
+ if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1)
c[sibling].booted_cores--;
}
- for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
- cpu_clear(cpu, cpu_sibling_map[sibling]);
- cpus_clear(cpu_sibling_map[cpu]);
+ for_each_cpu_mask(sibling, per_cpu(cpu_sibling_map, cpu))
+ cpu_clear(cpu, per_cpu(cpu_sibling_map, sibling));
+ cpus_clear(per_cpu(cpu_sibling_map, cpu));
cpus_clear(per_cpu(cpu_core_map, cpu));
c[cpu].phys_proc_id = 0;
c[cpu].cpu_core_id = 0;
--- a/arch/i386/oprofile/op_model_p4.c
+++ b/arch/i386/oprofile/op_model_p4.c
@@ -379,7 +379,7 @@
{
#ifdef CONFIG_SMP
int cpu = smp_processor_id();
- return (cpu != first_cpu(cpu_sibling_map[cpu]));
+ return (cpu != first_cpu(per_cpu(cpu_sibling_map, cpu)));
#endif
return 0;
}
--- a/arch/i386/xen/smp.c
+++ b/arch/i386/xen/smp.c
@@ -147,7 +147,7 @@
make_lowmem_page_readwrite(&per_cpu__gdt_page);
for (cpu = 0; cpu < NR_CPUS; cpu++) {
- cpus_clear(cpu_sibling_map[cpu]);
+ cpus_clear(per_cpu(cpu_sibling_map, cpu));
/*
* cpu_core_map lives in a per cpu area that is cleared
* when the per cpu array is allocated.
@@ -164,7 +164,7 @@
unsigned cpu;
for (cpu = 0; cpu < NR_CPUS; cpu++) {
- cpus_clear(cpu_sibling_map[cpu]);
+ cpus_clear(per_cpu(cpu_sibling_map, cpu));
/*
* cpu_core_ map will be zeroed when the per
* cpu area is allocated.
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -91,8 +91,8 @@
int smp_threads_ready;
/* representing HT siblings of each logical CPU */
-cpumask_t cpu_sibling_map[NR_CPUS] __read_mostly;
-EXPORT_SYMBOL(cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_t, cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
/* representing HT and core siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_t, cpu_core_map);
@@ -264,8 +264,8 @@
for_each_cpu_mask(i, cpu_sibling_setup_map) {
if (c[cpu].phys_proc_id == c[i].phys_proc_id &&
c[cpu].cpu_core_id == c[i].cpu_core_id) {
- cpu_set(i, cpu_sibling_map[cpu]);
- cpu_set(cpu, cpu_sibling_map[i]);
+ cpu_set(i, per_cpu(cpu_sibling_map, cpu));
+ cpu_set(cpu, per_cpu(cpu_sibling_map, i));
cpu_set(i, per_cpu(cpu_core_map, cpu));
cpu_set(cpu, per_cpu(cpu_core_map, i));
cpu_set(i, c[cpu].llc_shared_map);
@@ -273,13 +273,13 @@
}
}
} else {
- cpu_set(cpu, cpu_sibling_map[cpu]);
+ cpu_set(cpu, per_cpu(cpu_sibling_map, cpu));
}
cpu_set(cpu, c[cpu].llc_shared_map);
if (current_cpu_data.x86_max_cores == 1) {
- per_cpu(cpu_core_map, cpu) = cpu_sibling_map[cpu];
+ per_cpu(cpu_core_map, cpu) = per_cpu(cpu_sibling_map, cpu);
c[cpu].booted_cores = 1;
return;
}
@@ -296,12 +296,12 @@
/*
* Does this new cpu bringup a new core?
*/
- if (cpus_weight(cpu_sibling_map[cpu]) == 1) {
+ if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1) {
/*
* for each core in package, increment
* the booted_cores for this new cpu
*/
- if (first_cpu(cpu_sibling_map[i]) == i)
+ if (first_cpu(per_cpu(cpu_sibling_map, i)) == i)
c[cpu].booted_cores++;
/*
* increment the core count for all
@@ -741,7 +741,7 @@
phys_cpu_present_map = physid_mask_of_physid(boot_cpu_id);
else
phys_cpu_present_map = physid_mask_of_physid(0);
- cpu_set(0, cpu_sibling_map[0]);
+ cpu_set(0, per_cpu(cpu_sibling_map, 0));
cpu_set(0, per_cpu(cpu_core_map, 0));
}
@@ -982,13 +982,13 @@
/*
* last thread sibling in this cpu core going down
*/
- if (cpus_weight(cpu_sibling_map[cpu]) == 1)
+ if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1)
c[sibling].booted_cores--;
}
- for_each_cpu_mask(sibling, cpu_sibling_map[cpu])
- cpu_clear(cpu, cpu_sibling_map[sibling]);
- cpus_clear(cpu_sibling_map[cpu]);
+ for_each_cpu_mask(sibling, per_cpu(cpu_sibling_map, cpu))
+ cpu_clear(cpu, per_cpu(cpu_sibling_map, sibling));
+ cpus_clear(per_cpu(cpu_sibling_map, cpu));
cpus_clear(per_cpu(cpu_core_map, cpu));
c[cpu].phys_proc_id = 0;
c[cpu].cpu_core_id = 0;
--- a/block/blktrace.c
+++ b/block/blktrace.c
@@ -536,7 +536,7 @@
for_each_online_cpu(cpu) {
unsigned long long *cpu_off, *sibling_off;
- for_each_cpu_mask(i, cpu_sibling_map[cpu]) {
+ for_each_cpu_mask(i, per_cpu(cpu_sibling_map, cpu)) {
if (i == cpu)
continue;
--- a/include/asm-i386/smp.h
+++ b/include/asm-i386/smp.h
@@ -30,7 +30,7 @@
extern void smp_alloc_memory(void);
extern int pic_mode;
extern int smp_num_siblings;
-extern cpumask_t cpu_sibling_map[];
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_t, cpu_core_map);
extern void (*mtrr_hook) (void);
--- a/include/asm-i386/topology.h
+++ b/include/asm-i386/topology.h
@@ -31,7 +31,7 @@
#define topology_physical_package_id(cpu) (cpu_data[cpu].phys_proc_id)
#define topology_core_id(cpu) (cpu_data[cpu].cpu_core_id)
#define topology_core_siblings(cpu) (per_cpu(cpu_core_map, cpu))
-#define topology_thread_siblings(cpu) (cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu) (per_cpu(cpu_sibling_map, cpu))
#endif
#ifdef CONFIG_NUMA
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -38,12 +38,14 @@
extern int smp_num_siblings;
extern void smp_send_reschedule(int cpu);
-extern cpumask_t cpu_sibling_map[NR_CPUS];
/*
- * cpu_core_map lives in a per cpu area
+ * cpu_sibling_map and cpu_core_map now live
+ * in the per cpu area
*
+ * extern cpumask_t cpu_sibling_map[NR_CPUS];
* extern cpumask_t cpu_core_map[NR_CPUS];
*/
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_t, cpu_core_map);
extern u8 cpu_llc_id[NR_CPUS];
--- a/include/asm-x86_64/topology.h
+++ b/include/asm-x86_64/topology.h
@@ -59,7 +59,7 @@
#define topology_physical_package_id(cpu) (cpu_data[cpu].phys_proc_id)
#define topology_core_id(cpu) (cpu_data[cpu].cpu_core_id)
#define topology_core_siblings(cpu) (per_cpu(cpu_core_map, cpu))
-#define topology_thread_siblings(cpu) (cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu) (per_cpu(cpu_sibling_map, cpu))
#define mc_capable() (boot_cpu_data.x86_max_cores > 1)
#define smt_capable() (smp_num_siblings > 1)
#endif
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5854,7 +5854,7 @@
struct sched_group **sg)
{
int group;
- cpumask_t mask = cpu_sibling_map[cpu];
+ cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
cpus_and(mask, mask, *cpu_map);
group = first_cpu(mask);
if (sg)
@@ -5883,7 +5883,7 @@
cpus_and(mask, mask, *cpu_map);
group = first_cpu(mask);
#elif defined(CONFIG_SCHED_SMT)
- cpumask_t mask = cpu_sibling_map[cpu];
+ cpumask_t mask = per_cpu(cpu_sibling_map, cpu);
cpus_and(mask, mask, *cpu_map);
group = first_cpu(mask);
#else
@@ -6118,7 +6118,7 @@
p = sd;
sd = &per_cpu(cpu_domains, i);
*sd = SD_SIBLING_INIT;
- sd->span = cpu_sibling_map[i];
+ sd->span = per_cpu(cpu_sibling_map, i);
cpus_and(sd->span, sd->span, *cpu_map);
sd->parent = p;
p->child = sd;
@@ -6129,7 +6129,7 @@
#ifdef CONFIG_SCHED_SMT
/* Set up CPU (sibling) groups */
for_each_cpu_mask(i, *cpu_map) {
- cpumask_t this_sibling_map = cpu_sibling_map[i];
+ cpumask_t this_sibling_map = per_cpu(cpu_sibling_map, i);
cpus_and(this_sibling_map, this_sibling_map, *cpu_map);
if (i != first_cpu(this_sibling_map))
continue;
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 05/10] x86: Convert x86_cpu_to_apicid to be a per cpu variable (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (3 preceding siblings ...)
2007-09-12 1:56 ` [PATCH 04/10] x86: Convert cpu_sibling_map " travis
@ 2007-09-12 1:56 ` travis
2007-09-12 1:56 ` [PATCH 06/10] x86: Convert cpu_llc_id " travis
` (6 subsequent siblings)
11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
This patch converts the x86_cpu_to_apicid array to be a per
cpu variable. This saves sizeof(apicid) * NR unused cpus.
Access is mostly from startup and CPU HOTPLUG functions.
MP_processor_info() is one of the functions that require access
to the x86_cpu_to_apicid array before the per_cpu data area is
setup. For this case, a pointer to the __initdata array is
initialized in setup_arch() and removed in smp_prepare_cpus()
after the per_cpu data area is initialized.
A second change is included to change the initial array value
of ARCH i386 from 0xff to BAD_APICID to be consistent with
ARCH x86_64.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/i386/kernel/acpi/boot.c | 2 +-
arch/i386/kernel/smp.c | 2 +-
arch/i386/kernel/smpboot.c | 22 +++++++++++++++-------
arch/x86_64/kernel/genapic.c | 15 ++++++++++++---
arch/x86_64/kernel/genapic_flat.c | 2 +-
arch/x86_64/kernel/mpparse.c | 15 +++++++++++++--
arch/x86_64/kernel/setup.c | 5 +++++
arch/x86_64/kernel/smpboot.c | 23 ++++++++++++++++++++++-
arch/x86_64/mm/numa.c | 2 +-
include/asm-i386/smp.h | 6 ++++--
include/asm-x86_64/ipi.h | 2 +-
include/asm-x86_64/smp.h | 6 ++++--
12 files changed, 80 insertions(+), 22 deletions(-)
--- a/arch/i386/kernel/acpi/boot.c
+++ b/arch/i386/kernel/acpi/boot.c
@@ -555,7 +555,7 @@
int acpi_unmap_lsapic(int cpu)
{
- x86_cpu_to_apicid[cpu] = -1;
+ per_cpu(x86_cpu_to_apicid, cpu) = -1;
cpu_clear(cpu, cpu_present_map);
num_processors--;
--- a/arch/i386/kernel/smp.c
+++ b/arch/i386/kernel/smp.c
@@ -673,7 +673,7 @@
int i;
for (i = 0; i < NR_CPUS; i++) {
- if (x86_cpu_to_apicid[i] == apic_id)
+ if (per_cpu(x86_cpu_to_apicid, i) == apic_id)
return i;
}
return -1;
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -92,9 +92,17 @@
struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;
EXPORT_SYMBOL(cpu_data);
-u8 x86_cpu_to_apicid[NR_CPUS] __read_mostly =
- { [0 ... NR_CPUS-1] = 0xff };
-EXPORT_SYMBOL(x86_cpu_to_apicid);
+/*
+ * The following static array is used during kernel startup
+ * and the x86_cpu_to_apicid_ptr contains the address of the
+ * array during this time. Is it zeroed when the per_cpu
+ * data area is removed.
+ */
+u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata =
+ { [0 ... NR_CPUS-1] = BAD_APICID };
+void *x86_cpu_to_apicid_ptr;
+DEFINE_PER_CPU(u8, x86_cpu_to_apicid) = BAD_APICID;
+EXPORT_PER_CPU_SYMBOL(x86_cpu_to_apicid);
u8 apicid_2_node[MAX_APICID];
@@ -804,7 +812,7 @@
irq_ctx_init(cpu);
- x86_cpu_to_apicid[cpu] = apicid;
+ per_cpu(x86_cpu_to_apicid, cpu) = apicid;
/*
* This grunge runs the startup process for
* the targeted processor.
@@ -866,7 +874,7 @@
cpu_clear(cpu, cpu_initialized); /* was set by cpu_init() */
cpucount--;
} else {
- x86_cpu_to_apicid[cpu] = apicid;
+ per_cpu(x86_cpu_to_apicid, cpu) = apicid;
cpu_set(cpu, cpu_present_map);
}
@@ -915,7 +923,7 @@
struct warm_boot_cpu_info info;
int apicid, ret;
- apicid = x86_cpu_to_apicid[cpu];
+ apicid = per_cpu(x86_cpu_to_apicid, cpu);
if (apicid == BAD_APICID) {
ret = -ENODEV;
goto exit;
@@ -965,7 +973,7 @@
boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
boot_cpu_logical_apicid = logical_smp_processor_id();
- x86_cpu_to_apicid[0] = boot_cpu_physical_apicid;
+ per_cpu(x86_cpu_to_apicid, 0) = boot_cpu_physical_apicid;
current_thread_info()->cpu = 0;
--- a/arch/x86_64/kernel/mpparse.c
+++ b/arch/x86_64/kernel/mpparse.c
@@ -86,7 +86,7 @@
return sum & 0xFF;
}
-static void __cpuinit MP_processor_info (struct mpc_config_processor *m)
+static void __cpuinit MP_processor_info(struct mpc_config_processor *m)
{
int cpu;
cpumask_t tmp_map;
@@ -123,7 +123,18 @@
cpu = 0;
}
bios_cpu_apicid[cpu] = m->mpc_apicid;
- x86_cpu_to_apicid[cpu] = m->mpc_apicid;
+ /*
+ * We get called early in the the start_kernel initialization
+ * process when the per_cpu data area is not yet setup, so we
+ * use a static array that is removed after the per_cpu data
+ * area is created.
+ */
+ if (x86_cpu_to_apicid_ptr) {
+ u8 *x86_cpu_to_apicid = (u8 *)x86_cpu_to_apicid_ptr;
+ x86_cpu_to_apicid[cpu] = m->mpc_apicid;
+ } else {
+ per_cpu(x86_cpu_to_apicid, cpu) = m->mpc_apicid;
+ }
cpu_set(cpu, cpu_possible_map);
cpu_set(cpu, cpu_present_map);
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -701,7 +701,7 @@
clear_node_cpumask(cpu); /* was set by numa_add_cpu */
cpu_clear(cpu, cpu_present_map);
cpu_clear(cpu, cpu_possible_map);
- x86_cpu_to_apicid[cpu] = BAD_APICID;
+ per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
return -EIO;
}
@@ -848,6 +848,26 @@
}
/*
+ * Copy apicid's found by MP_processor_info from initial array to the per cpu
+ * data area. The x86_cpu_to_apicid_init array is then expendable and the
+ * x86_cpu_to_apicid_ptr is zeroed indicating that the static array is no
+ * longer available.
+ */
+void __init smp_set_apicids(void)
+{
+ int cpu;
+
+ for_each_cpu_mask(cpu, cpu_possible_map) {
+ if (per_cpu_offset(cpu))
+ per_cpu(x86_cpu_to_apicid, cpu) =
+ x86_cpu_to_apicid_init[cpu];
+ }
+
+ /* indicate the static array will be going away soon */
+ x86_cpu_to_apicid_ptr = NULL;
+}
+
+/*
* Prepare for SMP bootup. The MP table or ACPI has been read
* earlier. Just do some sanity checking here and enable APIC mode.
*/
@@ -856,6 +876,7 @@
nmi_watchdog_default();
current_cpu_data = boot_cpu_data;
current_thread_info()->cpu = 0; /* needed? */
+ smp_set_apicids();
set_cpu_sibling_map(0);
if (smp_sanity_check(max_cpus) < 0) {
--- a/arch/x86_64/mm/numa.c
+++ b/arch/x86_64/mm/numa.c
@@ -612,7 +612,7 @@
{
int i;
for (i = 0; i < NR_CPUS; i++) {
- u8 apicid = x86_cpu_to_apicid[i];
+ u8 apicid = x86_cpu_to_apicid_init[i];
if (apicid == BAD_APICID)
continue;
if (apicid_to_node[apicid] == NUMA_NO_NODE)
--- a/include/asm-i386/smp.h
+++ b/include/asm-i386/smp.h
@@ -39,9 +39,11 @@
extern void unlock_ipi_call_lock(void);
#define MAX_APICID 256
-extern u8 x86_cpu_to_apicid[];
+extern u8 __initdata x86_cpu_to_apicid_init[];
+extern void *x86_cpu_to_apicid_ptr;
+DECLARE_PER_CPU(u8, x86_cpu_to_apicid);
-#define cpu_physical_id(cpu) x86_cpu_to_apicid[cpu]
+#define cpu_physical_id(cpu) per_cpu(x86_cpu_to_apicid, cpu)
extern void set_cpu_sibling_map(int cpu);
--- a/include/asm-x86_64/ipi.h
+++ b/include/asm-x86_64/ipi.h
@@ -119,7 +119,7 @@
*/
local_irq_save(flags);
for_each_cpu_mask(query_cpu, mask) {
- __send_IPI_dest_field(x86_cpu_to_apicid[query_cpu],
+ __send_IPI_dest_field(per_cpu(x86_cpu_to_apicid, query_cpu),
vector, APIC_DEST_PHYSICAL);
}
local_irq_restore(flags);
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -85,7 +85,9 @@
* Some lowlevel functions might want to know about
* the real APIC ID <-> CPU # mapping.
*/
-extern u8 x86_cpu_to_apicid[NR_CPUS]; /* physical ID */
+extern u8 __initdata x86_cpu_to_apicid_init[];
+extern void *x86_cpu_to_apicid_ptr;
+DECLARE_PER_CPU(u8, x86_cpu_to_apicid); /* physical ID */
extern u8 bios_cpu_apicid[];
static inline int cpu_present_to_apicid(int mps_cpu)
@@ -116,7 +118,7 @@
}
#ifdef CONFIG_SMP
-#define cpu_physical_id(cpu) x86_cpu_to_apicid[cpu]
+#define cpu_physical_id(cpu) per_cpu(x86_cpu_to_apicid, cpu)
#else
#define cpu_physical_id(cpu) boot_cpu_id
#endif /* !CONFIG_SMP */
--- a/arch/x86_64/kernel/genapic_flat.c
+++ b/arch/x86_64/kernel/genapic_flat.c
@@ -172,7 +172,7 @@
*/
cpu = first_cpu(cpumask);
if ((unsigned)cpu < NR_CPUS)
- return x86_cpu_to_apicid[cpu];
+ return per_cpu(x86_cpu_to_apicid, cpu);
else
return BAD_APICID;
}
--- a/arch/x86_64/kernel/genapic.c
+++ b/arch/x86_64/kernel/genapic.c
@@ -24,10 +24,19 @@
#include <acpi/acpi_bus.h>
#endif
-/* which logical CPU number maps to which CPU (physical APIC ID) */
-u8 x86_cpu_to_apicid[NR_CPUS] __read_mostly
+/*
+ * which logical CPU number maps to which CPU (physical APIC ID)
+ *
+ * The following static array is used during kernel startup
+ * and the x86_cpu_to_apicid_ptr contains the address of the
+ * array during this time. Is it zeroed when the per_cpu
+ * data area is removed.
+ */
+u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata
= { [0 ... NR_CPUS-1] = BAD_APICID };
-EXPORT_SYMBOL(x86_cpu_to_apicid);
+void *x86_cpu_to_apicid_ptr;
+DEFINE_PER_CPU(u8, x86_cpu_to_apicid) = BAD_APICID;
+EXPORT_PER_CPU_SYMBOL(x86_cpu_to_apicid);
struct genapic __read_mostly *genapic = &apic_flat;
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -276,6 +276,11 @@
dmi_scan_machine();
+#ifdef CONFIG_SMP
+ /* setup to use the static apicid table during kernel startup */
+ x86_cpu_to_apicid_ptr = (void *)&x86_cpu_to_apicid_init;
+#endif
+
#ifdef CONFIG_ACPI
/*
* Initialize the ACPI boot-time table parser (gets the RSDP and SDT).
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 06/10] x86: Convert cpu_llc_id to be a per cpu variable (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (4 preceding siblings ...)
2007-09-12 1:56 ` [PATCH 05/10] x86: Convert x86_cpu_to_apicid " travis
@ 2007-09-12 1:56 ` travis
2007-09-12 1:56 ` [PATCH 07/10] x86: acpi-use-cpu_physical_id (v3) travis
` (5 subsequent siblings)
11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
Convert cpu_llc_id from a static array sized by NR_CPUS to a
per_cpu variable. This saves sizeof(cpu_llc_id) * NR unused
cpus. Access is mostly from startup and CPU HOTPLUG functions.
Note there's an addtional change of the type of cpu_llc_id
from int to u8 for ARCH i386 to correspond with the same
type in ARCH x86_64.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/i386/kernel/cpu/intel_cacheinfo.c | 4 ++--
arch/i386/kernel/smpboot.c | 6 +++---
arch/x86_64/kernel/smpboot.c | 6 +++---
include/asm-i386/processor.h | 6 +++++-
include/asm-x86_64/smp.h | 9 ++++-----
5 files changed, 17 insertions(+), 14 deletions(-)
--- a/arch/i386/kernel/cpu/intel_cacheinfo.c
+++ b/arch/i386/kernel/cpu/intel_cacheinfo.c
@@ -417,14 +417,14 @@
if (new_l2) {
l2 = new_l2;
#ifdef CONFIG_X86_HT
- cpu_llc_id[cpu] = l2_id;
+ per_cpu(cpu_llc_id, cpu) = l2_id;
#endif
}
if (new_l3) {
l3 = new_l3;
#ifdef CONFIG_X86_HT
- cpu_llc_id[cpu] = l3_id;
+ per_cpu(cpu_llc_id, cpu) = l3_id;
#endif
}
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -67,7 +67,7 @@
EXPORT_SYMBOL(smp_num_siblings);
/* Last level cache ID of each logical CPU */
-int cpu_llc_id[NR_CPUS] __cpuinitdata = {[0 ... NR_CPUS-1] = BAD_APICID};
+DEFINE_PER_CPU(u8, cpu_llc_id) = BAD_APICID;
/* representing HT siblings of each logical CPU */
DEFINE_PER_CPU(cpumask_t, cpu_sibling_map);
@@ -348,8 +348,8 @@
}
for_each_cpu_mask(i, cpu_sibling_setup_map) {
- if (cpu_llc_id[cpu] != BAD_APICID &&
- cpu_llc_id[cpu] == cpu_llc_id[i]) {
+ if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
+ per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
cpu_set(i, c[cpu].llc_shared_map);
cpu_set(cpu, c[i].llc_shared_map);
}
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -65,7 +65,7 @@
EXPORT_SYMBOL(smp_num_siblings);
/* Last level cache ID of each logical CPU */
-u8 cpu_llc_id[NR_CPUS] __cpuinitdata = {[0 ... NR_CPUS-1] = BAD_APICID};
+DEFINE_PER_CPU(u8, cpu_llc_id) = BAD_APICID;
/* Bitmask of currently online CPUs */
cpumask_t cpu_online_map __read_mostly;
@@ -285,8 +285,8 @@
}
for_each_cpu_mask(i, cpu_sibling_setup_map) {
- if (cpu_llc_id[cpu] != BAD_APICID &&
- cpu_llc_id[cpu] == cpu_llc_id[i]) {
+ if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
+ per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
cpu_set(i, c[cpu].llc_shared_map);
cpu_set(cpu, c[i].llc_shared_map);
}
--- a/include/asm-i386/processor.h
+++ b/include/asm-i386/processor.h
@@ -110,7 +110,11 @@
#define current_cpu_data boot_cpu_data
#endif
-extern int cpu_llc_id[NR_CPUS];
+/*
+ * the following now lives in the per cpu area:
+ * extern int cpu_llc_id[NR_CPUS];
+ */
+DECLARE_PER_CPU(u8, cpu_llc_id);
extern char ignore_fpu_irq;
void __init cpu_detect(struct cpuinfo_x86 *c);
--- a/include/asm-x86_64/smp.h
+++ b/include/asm-x86_64/smp.h
@@ -39,16 +39,14 @@
extern void smp_send_reschedule(int cpu);
/*
- * cpu_sibling_map and cpu_core_map now live
- * in the per cpu area
- *
+ * the following now live in the per cpu area:
* extern cpumask_t cpu_sibling_map[NR_CPUS];
* extern cpumask_t cpu_core_map[NR_CPUS];
+ * extern u8 cpu_llc_id[NR_CPUS];
*/
DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
DECLARE_PER_CPU(cpumask_t, cpu_core_map);
-
-extern u8 cpu_llc_id[NR_CPUS];
+DECLARE_PER_CPU(u8, cpu_llc_id);
#define SMP_TRAMPOLINE_BASE 0x6000
@@ -120,6 +118,7 @@
#ifdef CONFIG_SMP
#define cpu_physical_id(cpu) per_cpu(x86_cpu_to_apicid, cpu)
#else
+extern unsigned int boot_cpu_id;
#define cpu_physical_id(cpu) boot_cpu_id
#endif /* !CONFIG_SMP */
#endif
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 07/10] x86: acpi-use-cpu_physical_id (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (5 preceding siblings ...)
2007-09-12 1:56 ` [PATCH 06/10] x86: Convert cpu_llc_id " travis
@ 2007-09-12 1:56 ` travis
2007-09-12 1:56 ` [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3) travis
` (4 subsequent siblings)
11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
This is from an earlier message from Christoph Lameter:
processor_core.c currently tries to determine the apicid by special casing
for IA64 and x86. The desired information is readily available via
cpu_physical_id()
on IA64, i386 and x86_64.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Additionally, boot_cpu_id needed to be exported to fix compile errors in
dma code when !CONFIG_SMP.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86_64/kernel/mpparse.c | 2 ++
drivers/acpi/processor_core.c | 8 +-------
2 files changed, 3 insertions(+), 7 deletions(-)
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -419,12 +419,6 @@
return 0;
}
-#ifdef CONFIG_IA64
-#define arch_cpu_to_apicid ia64_cpu_to_sapicid
-#else
-#define arch_cpu_to_apicid x86_cpu_to_apicid
-#endif
-
static int map_madt_entry(u32 acpi_id)
{
unsigned long madt_end, entry;
@@ -498,7 +492,7 @@
return apic_id;
for (i = 0; i < NR_CPUS; ++i) {
- if (arch_cpu_to_apicid[i] == apic_id)
+ if (cpu_physical_id(i) == apic_id)
return i;
}
return -1;
--- a/arch/x86_64/kernel/mpparse.c
+++ b/arch/x86_64/kernel/mpparse.c
@@ -57,6 +57,8 @@
/* Processor that is doing the boot up */
unsigned int boot_cpu_id = -1U;
+EXPORT_SYMBOL(boot_cpu_id);
+
/* Internal processor count */
unsigned int num_processors __cpuinitdata = 0;
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (6 preceding siblings ...)
2007-09-12 1:56 ` [PATCH 07/10] x86: acpi-use-cpu_physical_id (v3) travis
@ 2007-09-12 1:56 ` travis
2007-09-28 9:49 ` Paul Jackson
2007-09-12 1:56 ` [PATCH 09/10] ppc64: " travis
` (3 subsequent siblings)
11 siblings, 1 reply; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
Convert cpu_sibling_map to a per_cpu cpumask_t array for the ia64
architecture. This fixes build errors in block/blktrace.c and
kernel/sched.c when CONFIG_SCHED_SMT is defined.
There was one access to cpu_sibling_map before the per_cpu data
area was created, so that step was moved to after the per_cpu
area is setup.
Tested and verified on an A4700.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/ia64/kernel/setup.c | 4 ----
arch/ia64/kernel/smpboot.c | 18 ++++++++++--------
arch/ia64/mm/contig.c | 6 ++++++
include/asm-ia64/smp.h | 2 +-
include/asm-ia64/topology.h | 2 +-
5 files changed, 18 insertions(+), 14 deletions(-)
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -528,10 +528,6 @@
#ifdef CONFIG_SMP
cpu_physical_id(0) = hard_smp_processor_id();
-
- cpu_set(0, cpu_sibling_map[0]);
- cpu_set(0, cpu_core_map[0]);
-
check_for_logical_procs();
if (smp_num_cpucores > 1)
printk(KERN_INFO
--- a/arch/ia64/kernel/smpboot.c
+++ b/arch/ia64/kernel/smpboot.c
@@ -138,7 +138,9 @@
EXPORT_SYMBOL(cpu_possible_map);
cpumask_t cpu_core_map[NR_CPUS] __cacheline_aligned;
-cpumask_t cpu_sibling_map[NR_CPUS] __cacheline_aligned;
+DEFINE_PER_CPU_SHARED_ALIGNED(cpumask_t, cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
+
int smp_num_siblings = 1;
int smp_num_cpucores = 1;
@@ -650,12 +652,12 @@
{
int i;
- for_each_cpu_mask(i, cpu_sibling_map[cpu])
- cpu_clear(cpu, cpu_sibling_map[i]);
+ for_each_cpu_mask(i, per_cpu(cpu_sibling_map, cpu))
+ cpu_clear(cpu, per_cpu(cpu_sibling_map, i));
for_each_cpu_mask(i, cpu_core_map[cpu])
cpu_clear(cpu, cpu_core_map[i]);
- cpu_sibling_map[cpu] = cpu_core_map[cpu] = CPU_MASK_NONE;
+ per_cpu(cpu_sibling_map, cpu) = cpu_core_map[cpu] = CPU_MASK_NONE;
}
static void
@@ -666,7 +668,7 @@
if (cpu_data(cpu)->threads_per_core == 1 &&
cpu_data(cpu)->cores_per_socket == 1) {
cpu_clear(cpu, cpu_core_map[cpu]);
- cpu_clear(cpu, cpu_sibling_map[cpu]);
+ cpu_clear(cpu, per_cpu(cpu_sibling_map, cpu));
return;
}
@@ -807,8 +809,8 @@
cpu_set(i, cpu_core_map[cpu]);
cpu_set(cpu, cpu_core_map[i]);
if (cpu_data(cpu)->core_id == cpu_data(i)->core_id) {
- cpu_set(i, cpu_sibling_map[cpu]);
- cpu_set(cpu, cpu_sibling_map[i]);
+ cpu_set(i, per_cpu(cpu_sibling_map, cpu));
+ cpu_set(cpu, per_cpu(cpu_sibling_map, i));
}
}
}
@@ -839,7 +841,7 @@
if (cpu_data(cpu)->threads_per_core == 1 &&
cpu_data(cpu)->cores_per_socket == 1) {
- cpu_set(cpu, cpu_sibling_map[cpu]);
+ cpu_set(cpu, per_cpu(cpu_sibling_map, cpu));
cpu_set(cpu, cpu_core_map[cpu]);
return 0;
}
--- a/include/asm-ia64/smp.h
+++ b/include/asm-ia64/smp.h
@@ -58,7 +58,7 @@
extern cpumask_t cpu_online_map;
extern cpumask_t cpu_core_map[NR_CPUS];
-extern cpumask_t cpu_sibling_map[NR_CPUS];
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
extern int smp_num_siblings;
extern int smp_num_cpucores;
extern void __iomem *ipi_base_addr;
--- a/include/asm-ia64/topology.h
+++ b/include/asm-ia64/topology.h
@@ -112,7 +112,7 @@
#define topology_physical_package_id(cpu) (cpu_data(cpu)->socket_id)
#define topology_core_id(cpu) (cpu_data(cpu)->core_id)
#define topology_core_siblings(cpu) (cpu_core_map[cpu])
-#define topology_thread_siblings(cpu) (cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu) (per_cpu(cpu_sibling_map, cpu))
#define smt_capable() (smp_num_siblings > 1)
#endif
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -212,6 +212,12 @@
cpu_data += PERCPU_PAGE_SIZE;
per_cpu(local_per_cpu_offset, cpu) = __per_cpu_offset[cpu];
}
+ /*
+ * cpu_sibling_map is now a per_cpu variable - it needs to
+ * be accessed after per_cpu_init() sets up the per_cpu area.
+ */
+ cpu_set(0, per_cpu(cpu_sibling_map, 0));
+ cpu_set(0, cpu_core_map[0]);
}
return __per_cpu_start + __per_cpu_offset[smp_processor_id()];
}
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3)
2007-09-12 1:56 ` [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3) travis
@ 2007-09-28 9:49 ` Paul Jackson
2007-10-03 19:22 ` Mike Travis
0 siblings, 1 reply; 18+ messages in thread
From: Paul Jackson @ 2007-09-28 9:49 UTC (permalink / raw)
To: travis; +Cc: linux-mm, ak, linux-kernel, linuxppc-dev, sparclinux, akpm,
clameter
Mike,
I think there is a bug either in this ia64 patch, or in the related
generic arch patch: Convert cpu_sibling_map to be a per cpu variable
(v3).
It dies early in boot on me, on the SGI internal 8 processor IA64
system that you and I know as 'margin'. The death is a hard hang, due
to a corrupt stack, due to a bogus cpu index.
I haven't tracked it down all the way, but have gotten this far. If I add
the following patch, I get a panic on the BUG_ON if I have these two patches
in 2.6.23-rc8-mm1, but it boots just fine if I don't have these two patches.
It seems that the "cpu_sibling_map[cpu]" cpumask_t is empty (all zero
bits) with your two patches applied, but has some non-zero bits
otherwise, which leads to 'group' being NR_CPUS instead of a useful CPU
number. Unfortunately, I have no idea why the "cpu_sibling_map[cpu]"
cpumask_t is empty -- good luck on that part.
The patch that catches this bug earlier is this:
--- 2.6.23-rc8-mm1.orig/kernel/sched.c 2007-09-28 01:42:20.144561024 -0700
+++ 2.6.23-rc8-mm1/kernel/sched.c 2007-09-28 02:27:14.239075497 -0700
@@ -5905,6 +5905,7 @@ static int cpu_to_phys_group(int cpu, co
#else
group = cpu;
#endif
+ BUG_ON(group == NR_CPUS);
if (sg)
*sg = &per_cpu(sched_group_phys, group);
return group;
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3)
2007-09-28 9:49 ` Paul Jackson
@ 2007-10-03 19:22 ` Mike Travis
0 siblings, 0 replies; 18+ messages in thread
From: Mike Travis @ 2007-10-03 19:22 UTC (permalink / raw)
To: Paul Jackson
Cc: linux-mm, ak, linux-kernel, linuxppc-dev, sparclinux, akpm,
clameter
Hi Paul,
I just now found this. I'll take a look immediately. I tried it
on a couple of systems but not margin.
Thanks,
Mike
Paul Jackson wrote:
> Mike,
>
> I think there is a bug either in this ia64 patch, or in the related
> generic arch patch: Convert cpu_sibling_map to be a per cpu variable
> (v3).
>
> It dies early in boot on me, on the SGI internal 8 processor IA64
> system that you and I know as 'margin'. The death is a hard hang, due
> to a corrupt stack, due to a bogus cpu index.
>
> I haven't tracked it down all the way, but have gotten this far. If I add
> the following patch, I get a panic on the BUG_ON if I have these two patches
> in 2.6.23-rc8-mm1, but it boots just fine if I don't have these two patches.
>
> It seems that the "cpu_sibling_map[cpu]" cpumask_t is empty (all zero
> bits) with your two patches applied, but has some non-zero bits
> otherwise, which leads to 'group' being NR_CPUS instead of a useful CPU
> number. Unfortunately, I have no idea why the "cpu_sibling_map[cpu]"
> cpumask_t is empty -- good luck on that part.
>
> The patch that catches this bug earlier is this:
>
> --- 2.6.23-rc8-mm1.orig/kernel/sched.c 2007-09-28 01:42:20.144561024 -0700
> +++ 2.6.23-rc8-mm1/kernel/sched.c 2007-09-28 02:27:14.239075497 -0700
> @@ -5905,6 +5905,7 @@ static int cpu_to_phys_group(int cpu, co
> #else
> group = cpu;
> #endif
> + BUG_ON(group == NR_CPUS);
> if (sg)
> *sg = &per_cpu(sched_group_phys, group);
> return group;
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 09/10] ppc64: Convert cpu_sibling_map to a per_cpu data array (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (7 preceding siblings ...)
2007-09-12 1:56 ` [PATCH 08/10] ia64: Convert cpu_sibling_map to a per_cpu data array (v3) travis
@ 2007-09-12 1:56 ` travis
2007-09-17 6:28 ` Stephen Rothwell
2007-09-12 1:56 ` [PATCH 10/10] sparc64: " travis
` (2 subsequent siblings)
11 siblings, 1 reply; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
Convert cpu_sibling_map to a per_cpu cpumask_t array for the ppc64
architecture. This fixes build errors in block/blktrace.c and
kernel/sched.c when CONFIG_SCHED_SMT is defined.
Note: these changes have not been built nor tested.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/powerpc/kernel/setup-common.c | 4 ++--
arch/powerpc/kernel/smp.c | 4 ++--
arch/powerpc/platforms/cell/cbe_cpufreq.c | 2 +-
include/asm-powerpc/smp.h | 4 +++-
include/asm-powerpc/topology.h | 2 +-
5 files changed, 9 insertions(+), 7 deletions(-)
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -415,9 +415,9 @@
* Do the sibling map; assume only two threads per processor.
*/
for_each_possible_cpu(cpu) {
- cpu_set(cpu, cpu_sibling_map[cpu]);
+ cpu_set(cpu, cpu_sibling_map(cpu));
if (cpu_has_feature(CPU_FTR_SMT))
- cpu_set(cpu ^ 0x1, cpu_sibling_map[cpu]);
+ cpu_set(cpu ^ 0x1, cpu_sibling_map(cpu));
}
vdso_data->processorCount = num_present_cpus();
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -61,11 +61,11 @@
cpumask_t cpu_possible_map = CPU_MASK_NONE;
cpumask_t cpu_online_map = CPU_MASK_NONE;
-cpumask_t cpu_sibling_map[NR_CPUS] = { [0 ... NR_CPUS-1] = CPU_MASK_NONE };
+DEFINE_PER_CPU(cpumask_t, cpu_sibling_map) = CPU_MASK_NONE;
EXPORT_SYMBOL(cpu_online_map);
EXPORT_SYMBOL(cpu_possible_map);
-EXPORT_SYMBOL(cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
/* SMP operations for this machine */
struct smp_ops_t *smp_ops;
--- a/arch/powerpc/platforms/cell/cbe_cpufreq.c
+++ b/arch/powerpc/platforms/cell/cbe_cpufreq.c
@@ -119,7 +119,7 @@
policy->cur = cbe_freqs[cur_pmode].frequency;
#ifdef CONFIG_SMP
- policy->cpus = cpu_sibling_map[policy->cpu];
+ policy->cpus = cpu_sibling_map(policy->cpu);
#endif
cpufreq_frequency_table_get_attr(cbe_freqs, policy->cpu);
--- a/include/asm-powerpc/smp.h
+++ b/include/asm-powerpc/smp.h
@@ -25,6 +25,7 @@
#ifdef CONFIG_PPC64
#include <asm/paca.h>
+#include <asm/percpu.h>
#endif
extern int boot_cpuid;
@@ -58,7 +59,8 @@
(smp_hw_index[(cpu)] = (phys))
#endif
-extern cpumask_t cpu_sibling_map[NR_CPUS];
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
+#define cpu_sibling_map(cpu) per_cpu(cpu_sibling_map, cpu)
/* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
*
--- a/include/asm-powerpc/topology.h
+++ b/include/asm-powerpc/topology.h
@@ -108,7 +108,7 @@
#ifdef CONFIG_PPC64
#include <asm/smp.h>
-#define topology_thread_siblings(cpu) (cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu) (cpu_sibling_map(cpu))
#endif
#endif
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 09/10] ppc64: Convert cpu_sibling_map to a per_cpu data array (v3)
2007-09-12 1:56 ` [PATCH 09/10] ppc64: " travis
@ 2007-09-17 6:28 ` Stephen Rothwell
2007-09-17 6:39 ` Stephen Rothwell
2007-09-17 15:22 ` Mike Travis
0 siblings, 2 replies; 18+ messages in thread
From: Stephen Rothwell @ 2007-09-17 6:28 UTC (permalink / raw)
To: travis
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Andrew Morton, Christoph Lameter
[-- Attachment #1: Type: text/plain, Size: 1319 bytes --]
On Tue, 11 Sep 2007 18:56:53 -0700 travis@sgi.com wrote:
>
> Convert cpu_sibling_map to a per_cpu cpumask_t array for the ppc64
> architecture. This fixes build errors in block/blktrace.c and
> kernel/sched.c when CONFIG_SCHED_SMT is defined.
>
> Note: these changes have not been built nor tested.
After applying all 10 patches, the ppc64_defconfig builds but:
vmlinux is larger:
text data bss dec hex filename
7705776 1756984 504624 9967384 981718 ppc64/vmlinux
7706228 1757120 504624 9967972 981964 trav.bld/vmlinux
the topology (on my POWERPC5+ box) is not correct:
cpu0/topology/thread_siblings:0000000f
cpu1/topology/thread_siblings:0000000f
cpu2/topology/thread_siblings:0000000f
cpu3/topology/thread_siblings:0000000f
it used to be:
cpu0/topology/thread_siblings:00000003
cpu1/topology/thread_siblings:00000003
cpu2/topology/thread_siblings:0000000c
cpu3/topology/thread_siblings:0000000c
Similarly on my iSeries box, the topology is displayed as above
while it used to be:
cpu0/topology/thread_siblings:00000001
cpu1/topology/thread_siblings:00000002
cpu2/topology/thread_siblings:00000004
cpu3/topology/thread_siblings:00000008
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 09/10] ppc64: Convert cpu_sibling_map to a per_cpu data array (v3)
2007-09-17 6:28 ` Stephen Rothwell
@ 2007-09-17 6:39 ` Stephen Rothwell
2007-09-17 15:22 ` Mike Travis
1 sibling, 0 replies; 18+ messages in thread
From: Stephen Rothwell @ 2007-09-17 6:39 UTC (permalink / raw)
To: travis
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Andrew Morton, Christoph Lameter
[-- Attachment #1: Type: text/plain, Size: 718 bytes --]
On Mon, 17 Sep 2007 16:28:31 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> the topology (on my POWERPC5+ box) is not correct:
>
> cpu0/topology/thread_siblings:0000000f
> cpu1/topology/thread_siblings:0000000f
> cpu2/topology/thread_siblings:0000000f
> cpu3/topology/thread_siblings:0000000f
>
> it used to be:
>
> cpu0/topology/thread_siblings:00000003
> cpu1/topology/thread_siblings:00000003
> cpu2/topology/thread_siblings:0000000c
> cpu3/topology/thread_siblings:0000000c
This would be because we are setting up the cpu_sibling map before we
call setup_per_cpu_areas().
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 09/10] ppc64: Convert cpu_sibling_map to a per_cpu data array (v3)
2007-09-17 6:28 ` Stephen Rothwell
2007-09-17 6:39 ` Stephen Rothwell
@ 2007-09-17 15:22 ` Mike Travis
1 sibling, 0 replies; 18+ messages in thread
From: Mike Travis @ 2007-09-17 15:22 UTC (permalink / raw)
To: Stephen Rothwell
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Andrew Morton, Christoph Lameter
Stephen Rothwell wrote:
> On Tue, 11 Sep 2007 18:56:53 -0700 travis@sgi.com wrote:
>> Convert cpu_sibling_map to a per_cpu cpumask_t array for the ppc64
>> architecture. This fixes build errors in block/blktrace.c and
>> kernel/sched.c when CONFIG_SCHED_SMT is defined.
>>
>> Note: these changes have not been built nor tested.
>
> After applying all 10 patches, the ppc64_defconfig builds but:
>
> vmlinux is larger:
>
> text data bss dec hex filename
> 7705776 1756984 504624 9967384 981718 ppc64/vmlinux
> 7706228 1757120 504624 9967972 981964 trav.bld/vmlinux
>
> the topology (on my POWERPC5+ box) is not correct:
>
> cpu0/topology/thread_siblings:0000000f
> cpu1/topology/thread_siblings:0000000f
> cpu2/topology/thread_siblings:0000000f
> cpu3/topology/thread_siblings:0000000f
>
> it used to be:
>
> cpu0/topology/thread_siblings:00000003
> cpu1/topology/thread_siblings:00000003
> cpu2/topology/thread_siblings:0000000c
> cpu3/topology/thread_siblings:0000000c
>
> Similarly on my iSeries box, the topology is displayed as above
> while it used to be:
>
> cpu0/topology/thread_siblings:00000001
> cpu1/topology/thread_siblings:00000002
> cpu2/topology/thread_siblings:00000004
> cpu3/topology/thread_siblings:00000008
>
Thanks Stephen for the feedback. It may be the same situation
that some of the other arch's encounter in that the per_cpu
area is being accessed before it's setup. I'll investigate
that a bit more.
Mike
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 10/10] sparc64: Convert cpu_sibling_map to a per_cpu data array (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (8 preceding siblings ...)
2007-09-12 1:56 ` [PATCH 09/10] ppc64: " travis
@ 2007-09-12 1:56 ` travis
2007-09-13 9:53 ` [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) Andi Kleen
2007-09-14 23:32 ` Andrew Morton
11 siblings, 0 replies; 18+ messages in thread
From: travis @ 2007-09-12 1:56 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
Convert cpu_sibling_map to a per_cpu cpumask_t array for the sparc64
architecture. This fixes build errors in block/blktrace.c and
kernel/sched.c when CONFIG_SCHED_SMT is defined.
Note: these changes have not been built nor tested.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/sparc64/kernel/smp.c | 17 ++++++++---------
include/asm-sparc64/smp.h | 3 ++-
include/asm-sparc64/topology.h | 2 +-
3 files changed, 11 insertions(+), 11 deletions(-)
--- a/arch/sparc64/kernel/smp.c
+++ b/arch/sparc64/kernel/smp.c
@@ -52,14 +52,13 @@
cpumask_t cpu_possible_map __read_mostly = CPU_MASK_NONE;
cpumask_t cpu_online_map __read_mostly = CPU_MASK_NONE;
-cpumask_t cpu_sibling_map[NR_CPUS] __read_mostly =
- { [0 ... NR_CPUS-1] = CPU_MASK_NONE };
+DEFINE_PER_CPU(cpumask_t, cpu_sibling_map) = CPU_MASK_NONE;
cpumask_t cpu_core_map[NR_CPUS] __read_mostly =
{ [0 ... NR_CPUS-1] = CPU_MASK_NONE };
EXPORT_SYMBOL(cpu_possible_map);
EXPORT_SYMBOL(cpu_online_map);
-EXPORT_SYMBOL(cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
EXPORT_SYMBOL(cpu_core_map);
static cpumask_t smp_commenced_mask;
@@ -1259,16 +1258,16 @@
for_each_present_cpu(i) {
unsigned int j;
- cpus_clear(cpu_sibling_map[i]);
+ cpus_clear(per_cpu(cpu_sibling_map, i));
if (cpu_data(i).proc_id == -1) {
- cpu_set(i, cpu_sibling_map[i]);
+ cpu_set(i, per_cpu(cpu_sibling_map, i));
continue;
}
for_each_present_cpu(j) {
if (cpu_data(i).proc_id ==
cpu_data(j).proc_id)
- cpu_set(j, cpu_sibling_map[i]);
+ cpu_set(j, per_cpu(cpu_sibling_map, i));
}
}
}
@@ -1340,9 +1339,9 @@
cpu_clear(cpu, cpu_core_map[i]);
cpus_clear(cpu_core_map[cpu]);
- for_each_cpu_mask(i, cpu_sibling_map[cpu])
- cpu_clear(cpu, cpu_sibling_map[i]);
- cpus_clear(cpu_sibling_map[cpu]);
+ for_each_cpu_mask(i, per_cpu(cpu_sibling_map, cpu))
+ cpu_clear(cpu, per_cpu(cpu_sibling_map, i));
+ cpus_clear(per_cpu(cpu_sibling_map, cpu));
c = &cpu_data(cpu);
--- a/include/asm-sparc64/smp.h
+++ b/include/asm-sparc64/smp.h
@@ -28,8 +28,9 @@
#include <asm/bitops.h>
#include <asm/atomic.h>
+#include <asm/percpu.h>
-extern cpumask_t cpu_sibling_map[NR_CPUS];
+DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
extern cpumask_t cpu_core_map[NR_CPUS];
extern int sparc64_multi_core;
--- a/include/asm-sparc64/topology.h
+++ b/include/asm-sparc64/topology.h
@@ -5,7 +5,7 @@
#define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id)
#define topology_core_id(cpu) (cpu_data(cpu).core_id)
#define topology_core_siblings(cpu) (cpu_core_map[cpu])
-#define topology_thread_siblings(cpu) (cpu_sibling_map[cpu])
+#define topology_thread_siblings(cpu) (per_cpu(cpu_sibling_map, cpu))
#define mc_capable() (sparc64_multi_core)
#define smt_capable() (sparc64_multi_core)
#endif /* CONFIG_SMP */
--
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (9 preceding siblings ...)
2007-09-12 1:56 ` [PATCH 10/10] sparc64: " travis
@ 2007-09-13 9:53 ` Andi Kleen
2007-09-14 23:32 ` Andrew Morton
11 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2007-09-13 9:53 UTC (permalink / raw)
To: travis
Cc: linux-mm, linux-kernel, linuxppc-dev, sparclinux, Andrew Morton,
Christoph Lameter
On Wednesday 12 September 2007 03:56, travis@sgi.com wrote:
> Note:
>
> This patch consolidates all the previous patches regarding
> the conversion of static arrays sized by NR_CPUS into per_cpu
> data arrays and is referenced against 2.6.23-rc6 .
Looks good to me from the x86 side. I'll leave it to Andrew to
handle for now though because it touches too many files
outside x86.
-Andi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3)
2007-09-12 1:56 [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) travis
` (10 preceding siblings ...)
2007-09-13 9:53 ` [PATCH 00/10] x86: Reduce Memory Usage and Inter-Node message traffic (v3) Andi Kleen
@ 2007-09-14 23:32 ` Andrew Morton
11 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2007-09-14 23:32 UTC (permalink / raw)
To: travis
Cc: linux-mm, Andi Kleen, linux-kernel, linuxppc-dev, sparclinux,
Christoph Lameter
On Tue, 11 Sep 2007 18:56:44 -0700
travis@sgi.com wrote:
> Changes for version v3:
>
> cpu_sibling_map has been converted to a per_cpu data array to fix
> build errors on ia64, ppc64 and sparc64 to accomodate references in
> block/blktrace.c and kernel/sched.c when CONFIG_SCHED_SMT is defined.
>
> Warning: ppc64 and sparc64 have not yet been built nor tested.
These patches all seem to be unaltered from what I had.
The first patch (x86: remove x86_cpu_to_log_apicid array) is in Andi's
tree as x86_64-mm-remove-x86_cpu_to_log_apicid.patch so I don't apply that.
The sparc64/ppc64/ia64 convert-cpu_sibling_map-to-a-per_cpu-data-array
patches need to be folded into the base patch so that we don't break the build
at any stage.
So what I ended up with was
x86-fix-cpu_to_node-references.patch
x86-convert-cpu_core_map-to-be-a-per-cpu-variable.patch
convert-cpu_sibling_map-to-be-a-per-cpu-variable.patch
convert-cpu_sibling_map-to-a-per_cpu-data-array-ia64.patch
convert-cpu_sibling_map-to-a-per_cpu-data-array-ppc64.patch
convert-cpu_sibling_map-to-a-per_cpu-data-array-sparc64.patch
x86-convert-x86_cpu_to_apicid-to-be-a-per-cpu-variable.patch
x86-convert-cpu_llc_id-to-be-a-per-cpu-variable.patch
where the four convert-cpu_sibling_map-to-* will be clumped into a
single diff.
^ permalink raw reply [flat|nested] 18+ messages in thread