public inbox for linux-kernel@vger.kernel.org
* [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node
@ 2007-07-22  0:49 Yinghai Lu
  2007-07-22  8:05 ` David Rientjes
  2007-07-22 12:52 ` Andi Kleen
  0 siblings, 2 replies; 6+ messages in thread
From: Yinghai Lu @ 2007-07-22  0:49 UTC (permalink / raw)
  To: Andi Kleen, Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1004 bytes --]

[PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node

When acpi=off or there is no SRAT defined, apicid_to_node is obtained from the
K8 Northbridge PCI configuration space by k8_scan_nodes() in
arch/x86_64/mm/k8topology.c.
The problem is that it assumes the BSP APIC ID is 0 at that point.
On a four-socket system with quad-core CPUs installed, every CPU's APIC ID
is offset by 4, and the BSP APIC ID is 4.
On an eight-socket system with dual-core CPUs installed, every CPU's APIC ID
is offset by 2, and the BSP APIC ID is 2.
We need to offset the apicid_to_node array by boot_cpu_id (the BSP APIC ID)
before we use the array.
boot_cpu_id is only valid after init_apic_mappings().

So call update_apicid_to_node() and init_cpu_to_node() after init_apic_mappings().
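
The shift described above can be sketched in isolation as follows. This is a minimal userspace sketch, not the kernel code: sketch_shift_table() and SKETCH_NO_NODE are hypothetical names, and the 0xFF sentinel is only an assumed stand-in for NUMA_NO_NODE. Unlike the patch, it bounds the loop by a table length passed in by the caller rather than by NR_CPUS.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for NUMA_NO_NODE; the real value is not taken from the patch. */
#define SKETCH_NO_NODE 0xFF

/*
 * Hypothetical helper: move every entry up by `offset` slots, so that the
 * node recorded for APIC ID i ends up at APIC ID i + offset, then mark the
 * first `offset` slots as having no node.
 */
static void sketch_shift_table(unsigned char *table, size_t len, size_t offset)
{
	size_t i;

	assert(offset <= len);

	/* No offset means the table is already correct. */
	if (offset == 0)
		return;

	/* Walk from the top down so no source slot is overwritten before it is read. */
	for (i = len; i-- > offset; )
		table[i] = table[i - offset];

	/* APIC IDs below the offset do not belong to any CPU. */
	for (i = 0; i < offset; i++)
		table[i] = SKETCH_NO_NODE;
}
```

For the four-socket quad-core case, calling sketch_shift_table(table, 256, 4) on a table that was built as if the BSP had APIC ID 0 would move node 0's entries to APIC IDs 4..7 and leave APIC IDs 0..3 without a node.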

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>

 arch/x86_64/kernel/setup.c |   10 ++++++++--
 arch/x86_64/mm/numa.c      |   24 ++++++++++++++++++++++++
 include/asm-x86_64/numa.h  |    2 ++
 3 files changed, 34 insertions(+), 2 deletions(-)

[-- Attachment #2: 3.patch --]
[-- Type: text/x-patch, Size: 2249 bytes --]


diff --git a/arch/x86_64/kernel/setup.c b/arch/x86_64/kernel/setup.c
index 33ef718..1862a91 100644
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -385,8 +385,6 @@ void __init setup_arch(char **cmdline_p)
 	acpi_boot_init();
 #endif
 
-	init_cpu_to_node();
-
 	/*
 	 * get boot-time SMP configuration:
 	 */
@@ -395,6 +393,14 @@ void __init setup_arch(char **cmdline_p)
 	init_apic_mappings();
 
 	/*
+	 * init_cpu_to_node() must come after init_apic_mappings():
+	 * we need boot_cpu_id (the BSP APIC ID) to adjust the
+	 * apicid_to_node array before init_cpu_to_node() runs.
+	 */
+	update_apicid_to_node();
+	init_cpu_to_node();
+
+	/*
 	 * We trust e820 completely. No explicit ROM probing in memory.
  	 */
 	e820_reserve_resources(); 
diff --git a/arch/x86_64/mm/numa.c b/arch/x86_64/mm/numa.c
index 5154894..91605aa 100644
--- a/arch/x86_64/mm/numa.c
+++ b/arch/x86_64/mm/numa.c
@@ -607,6 +607,30 @@ early_param("numa", numa_setup);
  * prior to this call, and this initialization is good enough
  * for the fake NUMA cases.
  */
+void __init update_apicid_to_node(void)
+{
+	/*
+	 * Shift the apicid_to_node array when boot_cpu_id != 0 and
+	 * apicid_to_node[0] != NUMA_NO_NODE (i.e. not yet shifted).
+	 */
+
+	int i;
+
+	/* there is no apic id offset */
+	if (!boot_cpu_id)
+		return;
+
+	/* check if it is already updated */
+	if (apicid_to_node[0] == NUMA_NO_NODE)
+		return;
+
+	for (i = NR_CPUS -1; i >= boot_cpu_id; i--)
+		apicid_to_node[i] = apicid_to_node[i - boot_cpu_id];
+
+	for (i = boot_cpu_id - 1; i >= 0; i--)
+		apicid_to_node[i] = NUMA_NO_NODE;
+
+}
 void __init init_cpu_to_node(void)
 {
 	int i;
diff --git a/include/asm-x86_64/numa.h b/include/asm-x86_64/numa.h
index 933ff11..282b7a7 100644
--- a/include/asm-x86_64/numa.h
+++ b/include/asm-x86_64/numa.h
@@ -21,6 +21,7 @@ extern int hotadd_percent;
 
 extern unsigned char apicid_to_node[256];
 #ifdef CONFIG_NUMA
+extern void __init update_apicid_to_node(void);
 extern void __init init_cpu_to_node(void);
 
 static inline void clear_node_cpumask(int cpu)
@@ -29,6 +30,7 @@ static inline void clear_node_cpumask(int cpu)
 }
 
 #else
+#define update_apicid_to_node() do {} while (0)
 #define init_cpu_to_node() do {} while (0)
 #define clear_node_cpumask(cpu) do {} while (0)
 #endif


* Re: [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node
  2007-07-22  0:49 [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node Yinghai Lu
@ 2007-07-22  8:05 ` David Rientjes
  2007-07-22  8:17   ` Yinghai Lu
  2007-07-22 12:52 ` Andi Kleen
  1 sibling, 1 reply; 6+ messages in thread
From: David Rientjes @ 2007-07-22  8:05 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Andi Kleen, Andrew Morton, linux-kernel

On Sat, 21 Jul 2007, Yinghai Lu wrote:

> [PATCH 3/3] x86_64: offset apicid_to_node before use it before
> init_cpu_to_node
> 

Does this preserve the correct fake apicid_to_node mapping for the 
emulated case such as numa=fake=32?


* Re: [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node
  2007-07-22  8:05 ` David Rientjes
@ 2007-07-22  8:17   ` Yinghai Lu
  0 siblings, 0 replies; 6+ messages in thread
From: Yinghai Lu @ 2007-07-22  8:17 UTC (permalink / raw)
  To: David Rientjes; +Cc: Andi Kleen, Andrew Morton, linux-kernel

On 7/22/07, David Rientjes <rientjes@google.com> wrote:
> On Sat, 21 Jul 2007, Yinghai Lu wrote:
>
> > [PATCH 3/3] x86_64: offset apicid_to_node before use it before
> > init_cpu_to_node
> >
>
> Does this preserve the correct fake apicid_to_node mapping for the
> emulated case such as numa=fake=32?
>
+void __init update_apicid_to_node(void)
+{
+       /*
+        * Shift the apicid_to_node array when boot_cpu_id != 0 and
+        * apicid_to_node[0] != NUMA_NO_NODE (i.e. not yet shifted).
+        */
+
+       int i;
+
+       /* there is no apic id offset */
+       if (!boot_cpu_id)
+               return;
+
+       /* check if it is already updated */
+       if (apicid_to_node[0] == NUMA_NO_NODE)
+               return;

Isn't boot_cpu_id (the BSP APIC ID) 0 in the fake NUMA case?

Otherwise we need to add an early exit for fake NUMA mode.

YH


* Re: [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node
  2007-07-22  0:49 [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node Yinghai Lu
  2007-07-22  8:05 ` David Rientjes
@ 2007-07-22 12:52 ` Andi Kleen
  2007-07-22 22:54   ` Yinghai Lu
  1 sibling, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2007-07-22 12:52 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Andrew Morton, linux-kernel, joachim.deguara

On Sunday 22 July 2007 02:49:41 Yinghai Lu wrote:
> [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node
> 
> When acpi=off or there is no SRAT defined, apicid_to_node is obtained from the
> K8 Northbridge PCI configuration space by k8_scan_nodes() in
> arch/x86_64/mm/k8topology.c.
> The problem is that it assumes the BSP APIC ID is 0 at that point.
> On a four-socket system with quad-core CPUs installed, every CPU's APIC ID
> is offset by 4, and the BSP APIC ID is 4.
> On an eight-socket system with dual-core CPUs installed, every CPU's APIC ID
> is offset by 2, and the BSP APIC ID is 2.
> We need to offset the apicid_to_node array by boot_cpu_id (the BSP APIC ID)
> before we use the array.
> boot_cpu_id is only valid after init_apic_mappings().

<rant>
This thing is getting more and more messy. If it gets any more complicated
I promise I'll rip out the non ACPI support for quad core NUMA completely
and let it require ACPI.  Even the people who have a religious problem
with ACPI will need to eventually get over it and LinuxBIOS just has
to create proper tables, not pile hacks over hacks. It probably was a mistake in 
the first place to add it.
</rant>

I don't think you can mess with apicid_to_node[] unconditionally here. 
e.g. for the ACPI case or for the Intel NUMA case you'll just break everything.

What you should do is split init_apic_mappings() up and do an early
call that just checks if the CPU has an APIC and maps it using the 
fixmap and reads boot_cpu_id.  Then you can use that information
in k8topology.c to create correct tables.
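
A rough sketch of what building the table with the BSP APIC ID known up front might look like, as opposed to shifting it afterwards. Everything here is hypothetical: the names sketch_fill_table() and SKETCH_NO_NODE, the 0xFF sentinel, and especially the assumed layout in which each node owns a contiguous run of APIC IDs starting at the BSP APIC ID. The real k8topology.c code may number APIC IDs differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed stand-in for NUMA_NO_NODE. */
#define SKETCH_NO_NODE 0xFF

/*
 * Hypothetical helper: fill an APIC-ID-to-node table when the first
 * APIC ID in use (base_apicid, i.e. the BSP APIC ID) is already known.
 * Assumes node n owns APIC IDs
 *   base_apicid + n * cores_per_node ... base_apicid + (n + 1) * cores_per_node - 1.
 */
static void sketch_fill_table(unsigned char *table, size_t len,
			      size_t base_apicid, size_t nodes,
			      size_t cores_per_node)
{
	size_t node, core;

	/* Every APIC ID we are about to write must fit in the table. */
	assert(base_apicid + nodes * cores_per_node <= len);

	/* Start with every APIC ID unassigned. */
	memset(table, SKETCH_NO_NODE, len);

	for (node = 0; node < nodes; node++)
		for (core = 0; core < cores_per_node; core++)
			table[base_apicid + node * cores_per_node + core] =
				(unsigned char)node;
}
```

With the offset applied while the table is built, no later fix-up pass over the array is needed, so tables that come from other sources are never touched.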

-Andi


* Re: [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node
  2007-07-22 12:52 ` Andi Kleen
@ 2007-07-22 22:54   ` Yinghai Lu
  2007-07-23  0:29     ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: Yinghai Lu @ 2007-07-22 22:54 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, linux-kernel, joachim.deguara

On 7/22/07, Andi Kleen <ak@suse.de> wrote:
> On Sunday 22 July 2007 02:49:41 Yinghai Lu wrote:
> > [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node
> >
> > When acpi=off or there is no SRAT defined, apicid_to_node is obtained from the
> > K8 Northbridge PCI configuration space by k8_scan_nodes() in
> > arch/x86_64/mm/k8topology.c.
> > The problem is that it assumes the BSP APIC ID is 0 at that point.
> > On a four-socket system with quad-core CPUs installed, every CPU's APIC ID
> > is offset by 4, and the BSP APIC ID is 4.
> > On an eight-socket system with dual-core CPUs installed, every CPU's APIC ID
> > is offset by 2, and the BSP APIC ID is 2.
> > We need to offset the apicid_to_node array by boot_cpu_id (the BSP APIC ID)
> > before we use the array.
> > boot_cpu_id is only valid after init_apic_mappings().
>
> <rant>
> This thing is getting more and more messy. If it gets any more complicated
> I promise I'll rip out the non ACPI support for quad core NUMA completely
> and let it require ACPI.  Even the people who have a religious problem
> with ACPI will need to eventually get over it and LinuxBIOS just has
> to create proper tables, not pile hacks over hacks. It probably was a mistake in
> the first place to add it.
> </rant>
Then you will need to force every BIOS to provide a correct SRAT table.
>
> I don't think you can mess with apicid_to_node[] unconditionally here.
> e.g. for the ACPI case or for the Intel NUMA case you'll just break everything.
>
> What you should do is split init_apic_mappings() up and do an early
> call that just checks if the CPU has an APIC and maps it using the
> fixmap and reads boot_cpu_id.  Then you can use that information
> in k8topology.c to create correct tables.

Sounds good. I'll try to split an init_lapic_mappings() out of init_apic_mappings().

Thanks

YH


* Re: [PATCH 3/3] x86_64: offset apicid_to_node before use it before init_cpu_to_node
  2007-07-22 22:54   ` Yinghai Lu
@ 2007-07-23  0:29     ` Andi Kleen
  0 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2007-07-23  0:29 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Andrew Morton, linux-kernel, joachim.deguara


> Then you will need to force every BIOS to provide a correct SRAT table.

They are normally correct. I'm not aware of wrong SRAT tables
in production systems.

-Andi
