public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/15] Improve x86 smpboot integration
@ 2008-06-09 14:16 Glauber Costa
  2008-06-09 14:16 ` [PATCH 01/15] x86: use stack_start in x86_64 Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

Ingo,

First of all, sorry for being away for so long ;-)

Here it goes a series of improvements for the x86 smpboot integration.
The final goal is the same: To reduce the difference between architectures,
just this time I do a little bit more ;-)

Basically, this one reduces greatly the ifdef count:

[root@t60 linux-2.6-x86]# git-show 674ce:arch/x86/kernel/smpboot.c \
			  | grep -e "^#ifdef \+CONFIG_X86_\(32\|64\)" | wc -l
18
[root@t60 linux-2.6-x86]# git-show HEAD:arch/x86/kernel/smpboot.c \
			  | grep -e "^#ifdef \+CONFIG_X86_\(32\|64\)" | wc -l
8

The remaining ones are _mainly_ (not all) due to the fact that x86_64 uses the pda,
while i386 goes with normal per-cpu data. They are things like:

per_cpu(current_task, cpu) = c_idle.idle; vs cpu_pda(cpu)->pcurrent = c_idle.idle;

Also, there is the low mappings piece of code Hugh detected. To that, I intend
to also find a common base between them. Just I'm not doing it in this series, because
it deserves special attention.

As usual, this series was compiled tested in a whole bunch of different configs ( ~ 10
for each architecture), including all i386 variants. Boot tested in all my hardware.

The final diffstat is:

 arch/x86/kernel/head_32.S             |    2 
 arch/x86/kernel/head_64.S             |   48 +---------------
 arch/x86/kernel/io_apic_32.c          |    2 
 arch/x86/kernel/setup64.c             |    1 
 arch/x86/kernel/smpboot.c             |   97 ++++------------------------------
 b/arch/x86/kernel/acpi/sleep.c        |    2 
 b/arch/x86/kernel/apic_32.c           |    6 --
 b/arch/x86/kernel/apic_64.c           |   10 +++
 b/arch/x86/kernel/head_32.S           |    6 +-
 b/arch/x86/kernel/head_64.S           |    5 +
 b/arch/x86/kernel/io_apic_32.c        |    5 +
 b/arch/x86/kernel/io_apic_64.c        |    9 ++-
 b/arch/x86/kernel/process_32.c        |   16 +++++
 b/arch/x86/kernel/setup64.c           |    5 -
 b/arch/x86/kernel/setup_64.c          |   27 +++++++++
 b/arch/x86/kernel/smpboot.c           |    8 --
 b/arch/x86/kernel/traps_32.c          |    3 -
 b/arch/x86/kernel/x8664_ksyms_64.c    |    5 -
 b/arch/x86/mach-voyager/voyager_smp.c |    2 
 b/include/asm-x86/desc.h              |   24 +++-----
 b/include/asm-x86/hw_irq.h            |    3 -
 b/include/asm-x86/numa_32.h           |    1 
 b/include/asm-x86/segment.h           |   24 ++++----
 b/include/asm-x86/smp.h               |    2 
 include/asm-x86/hw_irq.h              |    4 -
 25 files changed, 122 insertions(+), 195 deletions(-)



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 01/15] x86: use stack_start in x86_64
  2008-06-09 14:16 [PATCH 0/15] Improve x86 smpboot integration Glauber Costa
@ 2008-06-09 14:16 ` Glauber Costa
  2008-06-09 14:16   ` [PATCH 02/15] x86: don't use gdt_page openly Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

call x86_64's init_rsp stack_start, just as i386 does.
Put a zeroed stack segment for consistency. With this,
we can eliminate one ugly ifdef in smpboot.c.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/acpi/sleep.c |    2 +-
 arch/x86/kernel/head_64.S    |    5 +++--
 arch/x86/kernel/smpboot.c    |    7 +------
 3 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index afc25ee..0103af2 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -72,7 +72,7 @@ int acpi_save_state_mem(void)
 	saved_magic = 0x12345678;
 #else /* CONFIG_64BIT */
 	header->trampoline_segment = setup_trampoline() >> 4;
-	init_rsp = (unsigned long)temp_stack + 4096;
+	stack_start.sp = temp_stack + 4096;
 	initial_code = (unsigned long)wakeup_long64;
 	saved_magic = 0x123456789abcdef0;
 #endif /* CONFIG_64BIT */
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d8ed325..4949c32 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -191,7 +191,7 @@ ENTRY(secondary_startup_64)
 	movq	%rax, %cr0
 
 	/* Setup a boot time stack */
-	movq init_rsp(%rip),%rsp
+	movq stack_start(%rip),%rsp
 
 	/* zero EFLAGS after setting rsp */
 	pushq $0
@@ -252,8 +252,9 @@ ENTRY(secondary_startup_64)
 	.quad	x86_64_start_kernel
 	__FINITDATA
 
-	ENTRY(init_rsp)
+	ENTRY(stack_start)
 	.quad  init_thread_union+THREAD_SIZE-8
+	.word  0
 
 bad_address:
 	jmp bad_address
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 722b24b..c77a7d2 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -714,11 +714,7 @@ wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
 	 * target processor state.
 	 */
 	startup_ipi_hook(phys_apicid, (unsigned long) start_secondary,
-#ifdef CONFIG_X86_64
-			 (unsigned long)init_rsp);
-#else
 			 (unsigned long)stack_start.sp);
-#endif
 
 	/*
 	 * Run STARTUP IPI loop.
@@ -905,15 +901,14 @@ do_rest:
 	early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
 	c_idle.idle->thread.ip = (unsigned long) start_secondary;
 	/* Stack for startup_32 can be just as for start_secondary onwards */
-	stack_start.sp = (void *) c_idle.idle->thread.sp;
 	irq_ctx_init(cpu);
 #else
 	cpu_pda(cpu)->pcurrent = c_idle.idle;
-	init_rsp = c_idle.idle->thread.sp;
 	load_sp0(&per_cpu(init_tss, cpu), &c_idle.idle->thread);
 	initial_code = (unsigned long)start_secondary;
 	clear_tsk_thread_flag(c_idle.idle, TIF_FORK);
 #endif
+	stack_start.sp = (void *) c_idle.idle->thread.sp;
 
 	/* start_ip had better be page-aligned! */
 	start_ip = setup_trampoline();
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 02/15] x86: don't use gdt_page openly.
  2008-06-09 14:16 ` [PATCH 01/15] x86: use stack_start in x86_64 Glauber Costa
@ 2008-06-09 14:16   ` Glauber Costa
  2008-06-09 14:16     ` [PATCH 03/15] x86: remove early_gdt_descr reference Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

There's a macro available for that.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/traps_32.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/traps_32.c b/arch/x86/kernel/traps_32.c
index a92b0c7..9ad896d 100644
--- a/arch/x86/kernel/traps_32.c
+++ b/arch/x86/kernel/traps_32.c
@@ -1135,7 +1135,7 @@ void do_spurious_interrupt_bug(struct pt_regs *regs, long error_code)
 
 unsigned long patch_espfix_desc(unsigned long uesp, unsigned long kesp)
 {
-	struct desc_struct *gdt = __get_cpu_var(gdt_page).gdt;
+	struct desc_struct *gdt = get_cpu_gdt_table(smp_processor_id());
 	unsigned long base = (kesp - uesp) & -THREAD_SIZE;
 	unsigned long new_kesp = kesp - base;
 	unsigned long lim_pages = (new_kesp | (THREAD_SIZE - 1)) >> PAGE_SHIFT;
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 03/15] x86: remove early_gdt_descr reference
  2008-06-09 14:16   ` [PATCH 02/15] x86: don't use gdt_page openly Glauber Costa
@ 2008-06-09 14:16     ` Glauber Costa
  2008-06-09 14:16       ` [PATCH 04/15] x86: move x86_64 gdt closer to i386 Glauber Costa
  2008-06-09 15:23       ` [PATCH 03/15] x86: remove early_gdt_descr reference James Bottomley
  0 siblings, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh, James Bottomley

since we use switch_to_new_gdt, there is no point
in assigning early_gdt_descr except for the first
assignment, which is done manually.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
CC: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 arch/x86/kernel/smpboot.c           |    1 -
 arch/x86/mach-voyager/voyager_smp.c |    1 -
 2 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c77a7d2..181b109 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -898,7 +898,6 @@ do_rest:
 #ifdef CONFIG_X86_32
 	per_cpu(current_task, cpu) = c_idle.idle;
 	init_gdt(cpu);
-	early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
 	c_idle.idle->thread.ip = (unsigned long) start_secondary;
 	/* Stack for startup_32 can be just as for start_secondary onwards */
 	irq_ctx_init(cpu);
diff --git a/arch/x86/mach-voyager/voyager_smp.c b/arch/x86/mach-voyager/voyager_smp.c
index 8dedd01..6b97d9b 100644
--- a/arch/x86/mach-voyager/voyager_smp.c
+++ b/arch/x86/mach-voyager/voyager_smp.c
@@ -529,7 +529,6 @@ static void __init do_boot_cpu(__u8 cpu)
 
 	init_gdt(cpu);
 	per_cpu(current_task, cpu) = idle;
-	early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
 	irq_ctx_init(cpu);
 
 	/* Note: Don't modify initial ss override */
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 04/15] x86: move x86_64 gdt closer to i386
  2008-06-09 14:16     ` [PATCH 03/15] x86: remove early_gdt_descr reference Glauber Costa
@ 2008-06-09 14:16       ` Glauber Costa
  2008-06-09 14:16         ` [PATCH 05/15] x86: use initial_code for i386 Glauber Costa
  2008-06-09 15:23       ` [PATCH 03/15] x86: remove early_gdt_descr reference James Bottomley
  1 sibling, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

i386 and x86_64 used two different schemes for maintaining the gdt.
With this patch, x86_64 initial gdt table is defined in a .c file,
same way as i386 is now. Also, we call it "gdt_page", and the descriptor,
"early_gdt_descr". This way we achieve common naming, which can allow for
more code integration.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/head_64.S        |   48 ++++----------------------------------
 arch/x86/kernel/setup64.c        |    5 +---
 arch/x86/kernel/setup_64.c       |   27 +++++++++++++++++++++
 arch/x86/kernel/smpboot.c        |   10 +------
 arch/x86/kernel/x8664_ksyms_64.c |    5 ----
 include/asm-x86/desc.h           |   24 ++++++++-----------
 include/asm-x86/segment.h        |   23 +++++++++--------
 7 files changed, 57 insertions(+), 85 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 4949c32..febeabb 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -203,7 +203,7 @@ ENTRY(secondary_startup_64)
 	 * addresses where we're currently running on. We have to do that here
 	 * because in 32bit we couldn't load a 64bit linear address.
 	 */
-	lgdt	cpu_gdt_descr(%rip)
+	lgdt	early_gdt_descr(%rip)
 
 	/* set up data segments. actually 0 would do too */
 	movl $__KERNEL_DS,%eax
@@ -391,54 +391,16 @@ NEXT_PAGE(level2_spare_pgt)
 
 	.data
 	.align 16
-	.globl cpu_gdt_descr
-cpu_gdt_descr:
-	.word	gdt_end-cpu_gdt_table-1
-gdt:
-	.quad	cpu_gdt_table
-#ifdef CONFIG_SMP
-	.rept	NR_CPUS-1
-	.word	0
-	.quad	0
-	.endr
-#endif
+	.globl early_gdt_descr
+early_gdt_descr:
+	.word	GDT_ENTRIES*8-1
+	.quad   per_cpu__gdt_page
 
 ENTRY(phys_base)
 	/* This must match the first entry in level2_kernel_pgt */
 	.quad   0x0000000000000000
 
-/* We need valid kernel segments for data and code in long mode too
- * IRET will check the segment types  kkeil 2000/10/28
- * Also sysret mandates a special GDT layout 
- */
-		 		
-	.section .data.page_aligned, "aw"
-	.align PAGE_SIZE
-
-/* The TLS descriptors are currently at a different place compared to i386.
-   Hopefully nobody expects them at a fixed place (Wine?) */
 	
-ENTRY(cpu_gdt_table)
-	.quad	0x0000000000000000	/* NULL descriptor */
-	.quad	0x00cf9b000000ffff	/* __KERNEL32_CS */
-	.quad	0x00af9b000000ffff	/* __KERNEL_CS */
-	.quad	0x00cf93000000ffff	/* __KERNEL_DS */
-	.quad	0x00cffb000000ffff	/* __USER32_CS */
-	.quad	0x00cff3000000ffff	/* __USER_DS, __USER32_DS  */
-	.quad	0x00affb000000ffff	/* __USER_CS */
-	.quad	0x0			/* unused */
-	.quad	0,0			/* TSS */
-	.quad	0,0			/* LDT */
-	.quad   0,0,0			/* three TLS descriptors */ 
-	.quad	0x0000f40000000000	/* node/CPU stored in limit */
-gdt_end:	
-	/* asm/segment.h:GDT_ENTRIES must match this */	
-	/* This should be a multiple of the cache line size */
-	/* GDTs of other CPUs are now dynamically allocated */
-
-	/* zero the remaining page */
-	.fill PAGE_SIZE / 8 - GDT_ENTRIES,8,0
-
 	.section .bss, "aw", @nobits
 	.align L1_CACHE_BYTES
 ENTRY(idt_table)
diff --git a/arch/x86/kernel/setup64.c b/arch/x86/kernel/setup64.c
index fc1a56d..70ff071 100644
--- a/arch/x86/kernel/setup64.c
+++ b/arch/x86/kernel/setup64.c
@@ -202,11 +202,8 @@ void __cpuinit cpu_init (void)
 	 * Initialize the per-CPU GDT with the boot GDT,
 	 * and set up the GDT descriptor:
 	 */
-	if (cpu)
-		memcpy(get_cpu_gdt_table(cpu), cpu_gdt_table, GDT_SIZE);
 
-	cpu_gdt_descr[cpu].size = GDT_SIZE;
-	load_gdt((const struct desc_ptr *)&cpu_gdt_descr[cpu]);
+	switch_to_new_gdt();
 	load_idt((const struct desc_ptr *)&idt_descr);
 
 	memset(me->thread.tls_array, 0, GDT_ENTRY_TLS_ENTRIES * 8);
diff --git a/arch/x86/kernel/setup_64.c b/arch/x86/kernel/setup_64.c
index 2aee441..c57223f 100644
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -83,6 +83,22 @@
 #define ARCH_SETUP
 #endif
 
+/* We need valid kernel segments for data and code in long mode too
+ * IRET will check the segment types  kkeil 2000/10/28
+ * Also sysret mandates a special GDT layout
+ */
+/* The TLS descriptors are currently at a different place compared to i386.
+   Hopefully nobody expects them at a fixed place (Wine?) */
+DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
+	[GDT_ENTRY_KERNEL32_CS] = { { { 0x0000ffff, 0x00cf9b00 } } },
+	[GDT_ENTRY_KERNEL_CS] = { { { 0x0000ffff, 0x00af9b00 } } },
+	[GDT_ENTRY_KERNEL_DS] = { { { 0x0000ffff, 0x00cf9300 } } },
+	[GDT_ENTRY_DEFAULT_USER32_CS] = { { { 0x0000ffff, 0x00cffb00 } } },
+	[GDT_ENTRY_DEFAULT_USER_DS] = { { { 0x0000ffff, 0x00cff300 } } },
+	[GDT_ENTRY_DEFAULT_USER_CS] = { { { 0x0000ffff, 0x00affb00 } } },
+} };
+EXPORT_PER_CPU_SYMBOL_GPL(gdt_page);
+
 /*
  * Machine setup..
  */
@@ -273,6 +289,17 @@ void __attribute__((weak)) __init memory_setup(void)
        machine_specific_memory_setup();
 }
 
+/* Current gdt points %fs at the "master" per-cpu area: after this,
+ * it's on the real one. */
+void switch_to_new_gdt(void)
+{
+	struct desc_ptr gdt_descr;
+
+	gdt_descr.address = (long)get_cpu_gdt_table(smp_processor_id());
+	gdt_descr.size = GDT_SIZE - 1;
+	load_gdt(&gdt_descr);
+}
+
 /*
  * setup_arch - architecture-specific boot-time initializations
  *
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 181b109..69639fd 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -849,14 +849,8 @@ static int __cpuinit do_boot_cpu(int apicid, int cpu)
 		.done = COMPLETION_INITIALIZER_ONSTACK(c_idle.done),
 	};
 	INIT_WORK(&c_idle.work, do_fork_idle);
-#ifdef CONFIG_X86_64
-	/* allocate memory for gdts of secondary cpus. Hotplug is considered */
-	if (!cpu_gdt_descr[cpu].address &&
-		!(cpu_gdt_descr[cpu].address = get_zeroed_page(GFP_KERNEL))) {
-		printk(KERN_ERR "Failed to allocate GDT for CPU %d\n", cpu);
-		return -1;
-	}
 
+#ifdef CONFIG_X86_64
 	/* Allocate node local memory for AP pdas */
 	if (cpu > 0) {
 		boot_error = get_local_pda(cpu);
@@ -1253,8 +1247,8 @@ void __init native_smp_prepare_boot_cpu(void)
 	int me = smp_processor_id();
 #ifdef CONFIG_X86_32
 	init_gdt(me);
-	switch_to_new_gdt();
 #endif
+	switch_to_new_gdt();
 	/* already set me in cpu_online_map in boot_cpu_init() */
 	cpu_set(me, cpu_callout_map);
 	per_cpu(cpu_state, me) = CPU_ONLINE;
diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c
index 122885b..16a6751 100644
--- a/arch/x86/kernel/x8664_ksyms_64.c
+++ b/arch/x86/kernel/x8664_ksyms_64.c
@@ -60,8 +60,3 @@ EXPORT_SYMBOL(init_level4_pgt);
 EXPORT_SYMBOL(load_gs_index);
 
 EXPORT_SYMBOL(_proxy_pda);
-
-#ifdef CONFIG_PARAVIRT
-/* Virtualized guests may want to use it */
-EXPORT_SYMBOL_GPL(cpu_gdt_descr);
-#endif
diff --git a/include/asm-x86/desc.h b/include/asm-x86/desc.h
index b3875d4..07f9f2b 100644
--- a/include/asm-x86/desc.h
+++ b/include/asm-x86/desc.h
@@ -29,11 +29,17 @@ static inline void fill_ldt(struct desc_struct *desc,
 extern struct desc_ptr idt_descr;
 extern gate_desc idt_table[];
 
+struct gdt_page {
+	struct desc_struct gdt[GDT_ENTRIES];
+} __attribute__((aligned(PAGE_SIZE)));
+DECLARE_PER_CPU(struct gdt_page, gdt_page);
+
+static inline struct desc_struct *get_cpu_gdt_table(unsigned int cpu)
+{
+	return per_cpu(gdt_page, cpu).gdt;
+}
+
 #ifdef CONFIG_X86_64
-extern struct desc_struct cpu_gdt_table[GDT_ENTRIES];
-extern struct desc_ptr cpu_gdt_descr[];
-/* the cpu gdt accessor */
-#define get_cpu_gdt_table(x) ((struct desc_struct *)cpu_gdt_descr[x].address)
 
 static inline void pack_gate(gate_desc *gate, unsigned type, unsigned long func,
 			     unsigned dpl, unsigned ist, unsigned seg)
@@ -51,16 +57,6 @@ static inline void pack_gate(gate_desc *gate, unsigned type, unsigned long func,
 }
 
 #else
-struct gdt_page {
-	struct desc_struct gdt[GDT_ENTRIES];
-} __attribute__((aligned(PAGE_SIZE)));
-DECLARE_PER_CPU(struct gdt_page, gdt_page);
-
-static inline struct desc_struct *get_cpu_gdt_table(unsigned int cpu)
-{
-	return per_cpu(gdt_page, cpu).gdt;
-}
-
 static inline void pack_gate(gate_desc *gate, unsigned char type,
 			     unsigned long base, unsigned dpl, unsigned flags,
 			     unsigned short seg)
diff --git a/include/asm-x86/segment.h b/include/asm-x86/segment.h
index ed5131d..dfc8601 100644
--- a/include/asm-x86/segment.h
+++ b/include/asm-x86/segment.h
@@ -61,18 +61,14 @@
 #define GDT_ENTRY_TLS_MAX 	(GDT_ENTRY_TLS_MIN + GDT_ENTRY_TLS_ENTRIES - 1)
 
 #define GDT_ENTRY_DEFAULT_USER_CS	14
-#define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS * 8 + 3)
 
 #define GDT_ENTRY_DEFAULT_USER_DS	15
-#define __USER_DS (GDT_ENTRY_DEFAULT_USER_DS * 8 + 3)
 
 #define GDT_ENTRY_KERNEL_BASE	12
 
 #define GDT_ENTRY_KERNEL_CS		(GDT_ENTRY_KERNEL_BASE + 0)
-#define __KERNEL_CS (GDT_ENTRY_KERNEL_CS * 8)
 
 #define GDT_ENTRY_KERNEL_DS		(GDT_ENTRY_KERNEL_BASE + 1)
-#define __KERNEL_DS (GDT_ENTRY_KERNEL_DS * 8)
 
 #define GDT_ENTRY_TSS			(GDT_ENTRY_KERNEL_BASE + 4)
 #define GDT_ENTRY_LDT			(GDT_ENTRY_KERNEL_BASE + 5)
@@ -139,10 +135,11 @@
 #else
 #include <asm/cache.h>
 
-#define __KERNEL_CS	0x10
-#define __KERNEL_DS	0x18
+#define GDT_ENTRY_KERNEL32_CS 1
+#define GDT_ENTRY_KERNEL_CS 2
+#define GDT_ENTRY_KERNEL_DS 3
 
-#define __KERNEL32_CS   0x08
+#define __KERNEL32_CS   (GDT_ENTRY_KERNEL32_CS * 8)
 
 /*
  * we cannot use the same code segment descriptor for user and kernel
@@ -150,10 +147,10 @@
  * The segment offset needs to contain a RPL. Grr. -AK
  * GDT layout to get 64bit syscall right (sysret hardcodes gdt offsets)
  */
-
-#define __USER32_CS   0x23   /* 4*8+3 */
-#define __USER_DS     0x2b   /* 5*8+3 */
-#define __USER_CS     0x33   /* 6*8+3 */
+#define GDT_ENTRY_DEFAULT_USER32_CS 4
+#define GDT_ENTRY_DEFAULT_USER_DS 5
+#define GDT_ENTRY_DEFAULT_USER_CS 6
+#define __USER32_CS   (GDT_ENTRY_DEFAULT_USER32_CS * 8 + 3)
 #define __USER32_DS	__USER_DS
 
 #define GDT_ENTRY_TSS 8	/* needs two entries */
@@ -175,6 +172,10 @@
 
 #endif
 
+#define __KERNEL_CS	(GDT_ENTRY_KERNEL_CS * 8)
+#define __KERNEL_DS	(GDT_ENTRY_KERNEL_DS * 8)
+#define __USER_DS     (GDT_ENTRY_DEFAULT_USER_DS* 8 + 3)
+#define __USER_CS     (GDT_ENTRY_DEFAULT_USER_CS* 8 + 3)
 #ifndef CONFIG_PARAVIRT
 #define get_kernel_rpl()  0
 #endif
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 05/15] x86: use initial_code for i386
  2008-06-09 14:16       ` [PATCH 04/15] x86: move x86_64 gdt closer to i386 Glauber Costa
@ 2008-06-09 14:16         ` Glauber Costa
  2008-06-09 14:16           ` [PATCH 06/15] x86: boot secondary cpus through initial_code Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

x86_64 jumps to whatever is written in "initial_code" symbol,
instead of a fixed address. Do it for i386 too. It will allow us
to integrate more of the smp boot code.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/head_32.S |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index b98b338..ffb73a5 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -455,7 +455,10 @@ is386:	movl $2,%ecx		# set MP
 	jmp initialize_secondary # all other CPUs call initialize_secondary
 1:
 #endif /* CONFIG_SMP */
-	jmp i386_start_kernel
+	jmp *(initial_code)
+.align 4
+ENTRY(initial_code)
+	.long i386_start_kernel
 
 /*
  * We depend on ET to be correct. This checks for 287/387.
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 06/15] x86: boot secondary cpus through initial_code
  2008-06-09 14:16         ` [PATCH 05/15] x86: use initial_code for i386 Glauber Costa
@ 2008-06-09 14:16           ` Glauber Costa
  2008-06-09 14:16             ` [PATCH 07/15] x86: clearing io_apic harmless for x86_64 Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

remove "initialize_secondary". Boot both architectures via
initial_code.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/head_32.S |    2 +-
 arch/x86/kernel/smpboot.c |   25 +------------------------
 2 files changed, 2 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index ffb73a5..f67e934 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -452,7 +452,7 @@ is386:	movl $2,%ecx		# set MP
 	je   1f
 	movl $(__KERNEL_PERCPU), %eax
 	movl %eax,%fs		# set this cpu's percpu
-	jmp initialize_secondary # all other CPUs call initialize_secondary
+	movl (stack_start), %esp
 1:
 #endif /* CONFIG_SMP */
 	jmp *(initial_code)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 69639fd..0e4a44a 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -349,28 +349,6 @@ static void __cpuinit start_secondary(void *unused)
 	cpu_idle();
 }
 
-#ifdef CONFIG_X86_32
-/*
- * Everything has been set up for the secondary
- * CPUs - they just need to reload everything
- * from the task structure
- * This function must not return.
- */
-void __devinit initialize_secondary(void)
-{
-	/*
-	 * We don't actually need to load the full TSS,
-	 * basically just the stack pointer and the ip.
-	 */
-
-	asm volatile(
-		"movl %0,%%esp\n\t"
-		"jmp *%1"
-		:
-		:"m" (current->thread.sp), "m" (current->thread.ip));
-}
-#endif
-
 static void __cpuinit smp_apply_quirks(struct cpuinfo_x86 *c)
 {
 #ifdef CONFIG_X86_32
@@ -892,15 +870,14 @@ do_rest:
 #ifdef CONFIG_X86_32
 	per_cpu(current_task, cpu) = c_idle.idle;
 	init_gdt(cpu);
-	c_idle.idle->thread.ip = (unsigned long) start_secondary;
 	/* Stack for startup_32 can be just as for start_secondary onwards */
 	irq_ctx_init(cpu);
 #else
 	cpu_pda(cpu)->pcurrent = c_idle.idle;
 	load_sp0(&per_cpu(init_tss, cpu), &c_idle.idle->thread);
-	initial_code = (unsigned long)start_secondary;
 	clear_tsk_thread_flag(c_idle.idle, TIF_FORK);
 #endif
+	initial_code = (unsigned long)start_secondary;
 	stack_start.sp = (void *) c_idle.idle->thread.sp;
 
 	/* start_ip had better be page-aligned! */
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 07/15] x86: clearing io_apic harmless for x86_64
  2008-06-09 14:16           ` [PATCH 06/15] x86: boot secondary cpus through initial_code Glauber Costa
@ 2008-06-09 14:16             ` Glauber Costa
  2008-06-09 14:16               ` [PATCH 08/15] x86: remove ifdef from stepping Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

so remove ifdef.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/smpboot.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 0e4a44a..5a055a1 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1052,9 +1052,8 @@ static __init void disable_smp(void)
 {
 	cpu_present_map = cpumask_of_cpu(0);
 	cpu_possible_map = cpumask_of_cpu(0);
-#ifdef CONFIG_X86_32
 	smpboot_clear_io_apic_irqs();
-#endif
+
 	if (smp_found_config)
 		phys_cpu_present_map =
 				physid_mask_of_physid(boot_cpu_physical_apicid);
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 08/15] x86: remove ifdef from stepping
  2008-06-09 14:16             ` [PATCH 07/15] x86: clearing io_apic harmless for x86_64 Glauber Costa
@ 2008-06-09 14:16               ` Glauber Costa
  2008-06-09 14:16                 ` [PATCH 09/15] x86: change __setup_vector_irq with setup_vector_irq Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

The stepping won't affect x86_64, since there are not x86_64 k7's
or pentiums. So, although it adds to the binary size, remove the ifdef
for smoother integration

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/smpboot.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5a055a1..50d307e 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -351,7 +351,6 @@ static void __cpuinit start_secondary(void *unused)
 
 static void __cpuinit smp_apply_quirks(struct cpuinfo_x86 *c)
 {
-#ifdef CONFIG_X86_32
 	/*
 	 * Mask B, Pentium, but not Pentium MMX
 	 */
@@ -401,7 +400,6 @@ static void __cpuinit smp_apply_quirks(struct cpuinfo_x86 *c)
 
 valid_k7:
 	;
-#endif
 }
 
 static void __cpuinit smp_checks(void)
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 09/15] x86: change __setup_vector_irq with setup_vector_irq
  2008-06-09 14:16               ` [PATCH 08/15] x86: remove ifdef from stepping Glauber Costa
@ 2008-06-09 14:16                 ` Glauber Costa
  2008-06-09 14:16                   ` [PATCH 10/15] x86: provide connect_bsp_APIC for x86_64 Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

We create a version of it for i386, and then take the CONFIG_X86_64
ifdef out of the game. We could create a __setup_vector_irq for i386,
but it would incur in an unnecessary lock taking. Moreover, it is better
practice to only export setup_vector_irq anyway.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/io_apic_32.c |    5 +++++
 arch/x86/kernel/io_apic_64.c |    9 ++++++++-
 arch/x86/kernel/smpboot.c    |   11 ++---------
 include/asm-x86/hw_irq.h     |    2 +-
 4 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/io_apic_32.c b/arch/x86/kernel/io_apic_32.c
index 9dbe7e9..7fc071f 100644
--- a/arch/x86/kernel/io_apic_32.c
+++ b/arch/x86/kernel/io_apic_32.c
@@ -1209,6 +1209,11 @@ static int assign_irq_vector(int irq)
 
 	return vector;
 }
+
+void setup_vector_irq(int cpu)
+{
+}
+
 static struct irq_chip ioapic_chip;
 
 #define IOAPIC_AUTO	-1
diff --git a/arch/x86/kernel/io_apic_64.c b/arch/x86/kernel/io_apic_64.c
index 5ea376d..97d67c9 100644
--- a/arch/x86/kernel/io_apic_64.c
+++ b/arch/x86/kernel/io_apic_64.c
@@ -780,7 +780,7 @@ static void __clear_irq_vector(int irq)
 	cpus_clear(cfg->domain);
 }
 
-void __setup_vector_irq(int cpu)
+static void __setup_vector_irq(int cpu)
 {
 	/* Initialize vector_irq on a new cpu */
 	/* This function must be called with vector_lock held */
@@ -803,6 +803,13 @@ void __setup_vector_irq(int cpu)
 	}
 }
 
+void setup_vector_irq(int cpu)
+{
+	spin_lock(&vector_lock);
+	__setup_vector_irq(smp_processor_id());
+	spin_unlock(&vector_lock);
+}
+
 
 static struct irq_chip ioapic_chip;
 
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 50d307e..9d8dfec 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -329,15 +329,8 @@ static void __cpuinit start_secondary(void *unused)
 	 * smp_call_function().
 	 */
 	lock_ipi_call_lock();
-#ifdef CONFIG_X86_64
-	spin_lock(&vector_lock);
-
-	/* Setup the per cpu irq handling data structures */
-	__setup_vector_irq(smp_processor_id());
-	/*
-	 * Allow the master to continue.
-	 */
-	spin_unlock(&vector_lock);
+#ifdef CONFIG_X86_IO_APIC
+	setup_vector_irq(smp_processor_id());
 #endif
 	cpu_set(smp_processor_id(), cpu_online_map);
 	unlock_ipi_call_lock();
diff --git a/include/asm-x86/hw_irq.h b/include/asm-x86/hw_irq.h
index 1428b41..18f067c 100644
--- a/include/asm-x86/hw_irq.h
+++ b/include/asm-x86/hw_irq.h
@@ -97,9 +97,9 @@ extern void (*const interrupt[NR_IRQS])(void);
 #else
 typedef int vector_irq_t[NR_VECTORS];
 DECLARE_PER_CPU(vector_irq_t, vector_irq);
-extern void __setup_vector_irq(int cpu);
 extern spinlock_t vector_lock;
 #endif
+extern void setup_vector_irq(int cpu);
 
 #endif /* !ASSEMBLY_ */
 
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 10/15] x86: provide connect_bsp_APIC for x86_64
  2008-06-09 14:16                 ` [PATCH 09/15] x86: change __setup_vector_irq with setup_vector_irq Glauber Costa
@ 2008-06-09 14:16                   ` Glauber Costa
  2008-06-09 14:16                     ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

Although it is not really needed, we provide it to get
closer to i386. ifdefs around it are removed in smpboot.c

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/apic_64.c |   10 ++++++++++
 arch/x86/kernel/smpboot.c |    5 +----
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/apic_64.c b/arch/x86/kernel/apic_64.c
index 310bec1..8d62387 100644
--- a/arch/x86/kernel/apic_64.c
+++ b/arch/x86/kernel/apic_64.c
@@ -942,6 +942,8 @@ int __init APIC_init_uniprocessor(void)
 
 	verify_local_APIC();
 
+	connect_bsp_APIC();
+
 	phys_cpu_present_map = physid_mask_of_physid(boot_cpu_physical_apicid);
 	apic_write(APIC_ID, SET_APIC_ID(boot_cpu_physical_apicid));
 
@@ -1023,6 +1025,14 @@ asmlinkage void smp_error_interrupt(void)
 	irq_exit();
 }
 
+/**
+ *  * connect_bsp_APIC - attach the APIC to the interrupt system
+ *   */
+void __init connect_bsp_APIC(void)
+{
+	enable_apic_mode();
+}
+
 void disconnect_bsp_APIC(int virt_wire_setup)
 {
 	/* Go back to Virtual Wire compatibility mode */
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 9d8dfec..c4bb997 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1117,9 +1117,7 @@ static int __init smp_sanity_check(unsigned max_cpus)
 
 		localise_nmi_watchdog();
 
-#ifdef CONFIG_X86_32
 		connect_bsp_APIC();
-#endif
 		setup_local_APIC();
 		end_local_APIC_setup();
 		return -1;
@@ -1174,9 +1172,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	}
 	preempt_enable();
 
-#ifdef CONFIG_X86_32
 	connect_bsp_APIC();
-#endif
+
 	/*
 	 * Switch from PIC to APIC mode.
 	 */
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 14:16                   ` [PATCH 10/15] x86: provide connect_bsp_APIC for x86_64 Glauber Costa
@ 2008-06-09 14:16                     ` Glauber Costa
  2008-06-09 14:16                       ` [PATCH 12/15] x86: change naming to match x86_64 Glauber Costa
  2008-06-09 15:23                       ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Maciej W. Rozycki
  0 siblings, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

Do it, instead of keeping in io_apic_32.c. This is the way x86_64 already does.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/apic_32.c    |    6 +-----
 arch/x86/kernel/io_apic_32.c |    2 +-
 arch/x86/kernel/smpboot.c    |    3 ++-
 include/asm-x86/hw_irq.h     |    3 ---
 4 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/apic_32.c b/arch/x86/kernel/apic_32.c
index fa8cf79..a06442a 100644
--- a/arch/x86/kernel/apic_32.c
+++ b/arch/x86/kernel/apic_32.c
@@ -1277,11 +1277,7 @@ int __init APIC_init_uniprocessor(void)
 #endif
 		localise_nmi_watchdog();
 	end_local_APIC_setup();
-#ifdef CONFIG_X86_IO_APIC
-	if (smp_found_config)
-		if (!skip_ioapic_setup && nr_ioapics)
-			setup_IO_APIC();
-#endif
+
 	setup_boot_clock();
 
 	return 0;
diff --git a/arch/x86/kernel/io_apic_32.c b/arch/x86/kernel/io_apic_32.c
index 7fc071f..cb79ce3 100644
--- a/arch/x86/kernel/io_apic_32.c
+++ b/arch/x86/kernel/io_apic_32.c
@@ -1606,7 +1606,7 @@ void /*__init*/ print_PIC(void)
 
 #endif  /*  0  */
 
-static void __init enable_IO_APIC(void)
+void __init enable_IO_APIC(void)
 {
 	union IO_APIC_reg_01 reg_01;
 	int i8259_apic, i8259_pin;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c4bb997..49d2900 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1179,13 +1179,14 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	 */
 	setup_local_APIC();
 
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_X86_IO_APIC
 	/*
 	 * Enable IO APIC before setting up error vector
 	 */
 	if (!skip_ioapic_setup && nr_ioapics)
 		enable_IO_APIC();
 #endif
+
 	end_local_APIC_setup();
 
 	map_cpu_to_logical_apicid();
diff --git a/include/asm-x86/hw_irq.h b/include/asm-x86/hw_irq.h
index 18f067c..31b56cd 100644
--- a/include/asm-x86/hw_irq.h
+++ b/include/asm-x86/hw_irq.h
@@ -66,10 +66,7 @@ extern void disable_IO_APIC(void);
 extern void print_IO_APIC(void);
 extern int IO_APIC_get_PCI_irq_vector(int bus, int slot, int fn);
 extern void setup_ioapic_dest(void);
-
-#ifdef CONFIG_X86_64
 extern void enable_IO_APIC(void);
-#endif
 
 /* IPI functions */
 extern void send_IPI_self(int vector);
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 12/15] x86: change naming to match x86_64
  2008-06-09 14:16                     ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Glauber Costa
@ 2008-06-09 14:16                       ` Glauber Costa
  2008-06-09 14:16                         ` [PATCH 13/15] x86: remove cpu from maps Glauber Costa
  2008-06-09 15:23                       ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Maciej W. Rozycki
  1 sibling, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh, Glauber de Oliveira Costa

Change unmap_cpu_to_logical_apicid to numa_remove_cpu.
Besides being shorter, it is the same name x86_64 uses. We
can save an ifdef in the code this way.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
---
 arch/x86/kernel/smpboot.c |    8 ++------
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 49d2900..6316d30 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -181,13 +181,12 @@ static void map_cpu_to_logical_apicid(void)
 	map_cpu_to_node(cpu, node);
 }
 
-static void unmap_cpu_to_logical_apicid(int cpu)
+static void numa_remove_cpu(int cpu)
 {
 	cpu_2_logical_apicid[cpu] = BAD_APICID;
 	unmap_cpu_to_node(cpu);
 }
 #else
-#define unmap_cpu_to_logical_apicid(cpu) do {} while (0)
 #define map_cpu_to_logical_apicid()  do {} while (0)
 #endif
 
@@ -945,10 +944,7 @@ restore_state:
 
 	if (boot_error) {
 		/* Try to put things back the way they were before ... */
-		unmap_cpu_to_logical_apicid(cpu);
-#ifdef CONFIG_X86_64
 		numa_remove_cpu(cpu); /* was set by numa_add_cpu */
-#endif
 		cpu_clear(cpu, cpu_callout_map); /* was set by do_boot_cpu() */
 		cpu_clear(cpu, cpu_initialized); /* was set by cpu_init() */
 		cpu_clear(cpu, cpu_possible_map);
@@ -1246,7 +1242,7 @@ void cpu_exit_clear(void)
 	cpu_clear(cpu, cpu_callout_map);
 	cpu_clear(cpu, cpu_callin_map);
 
-	unmap_cpu_to_logical_apicid(cpu);
+	numa_remove_cpu(cpu);
 }
 #  endif /* CONFIG_X86_32 */
 
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 13/15] x86: remove cpu from maps
  2008-06-09 14:16                       ` [PATCH 12/15] x86: change naming to match x86_64 Glauber Costa
@ 2008-06-09 14:16                         ` Glauber Costa
  2008-06-09 14:16                           ` [PATCH 14/15] x86: move cpu_exit_clear to process_32.c Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

during cpu disable, take cpus out of all maps in i386, instead
of just the online map.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/smpboot.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 6316d30..f6b99b6 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1321,13 +1321,11 @@ __init void prefill_possible_map(void)
 static void __ref remove_cpu_from_maps(int cpu)
 {
 	cpu_clear(cpu, cpu_online_map);
-#ifdef CONFIG_X86_64
 	cpu_clear(cpu, cpu_callout_map);
 	cpu_clear(cpu, cpu_callin_map);
 	/* was set by cpu_init() */
 	clear_bit(cpu, (unsigned long *)&cpu_initialized);
 	numa_remove_cpu(cpu);
-#endif
 }
 
 int __cpu_disable(void)
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 14/15] x86: move cpu_exit_clear to process_32.c
  2008-06-09 14:16                         ` [PATCH 13/15] x86: remove cpu from maps Glauber Costa
@ 2008-06-09 14:16                           ` Glauber Costa
  2008-06-09 14:16                             ` [PATCH 15/15] x86: take load_sp0 out of smpboot.c Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

Take it out of smpboot.c, and move it to process_32.c, closer
to its only user.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/process_32.c |   16 ++++++++++++++++
 arch/x86/kernel/smpboot.c    |   19 +------------------
 include/asm-x86/numa_32.h    |    1 +
 include/asm-x86/smp.h        |    1 -
 4 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 1aaca76..99f8e7d 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -128,6 +128,22 @@ EXPORT_SYMBOL(default_idle);
 
 #ifdef CONFIG_HOTPLUG_CPU
 #include <asm/nmi.h>
+
+static void cpu_exit_clear(void)
+{
+	int cpu = raw_smp_processor_id();
+
+	idle_task_exit();
+
+	cpu_uninit();
+	irq_ctx_exit(cpu);
+
+	cpu_clear(cpu, cpu_callout_map);
+	cpu_clear(cpu, cpu_callin_map);
+
+	numa_remove_cpu(cpu);
+}
+
 /* We don't actually take CPU down, just spin without interrupts. */
 static inline void play_dead(void)
 {
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f6b99b6..7952c39 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -181,7 +181,7 @@ static void map_cpu_to_logical_apicid(void)
 	map_cpu_to_node(cpu, node);
 }
 
-static void numa_remove_cpu(int cpu)
+void numa_remove_cpu(int cpu)
 {
 	cpu_2_logical_apicid[cpu] = BAD_APICID;
 	unmap_cpu_to_node(cpu);
@@ -1229,23 +1229,6 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
 
 #ifdef CONFIG_HOTPLUG_CPU
 
-#  ifdef CONFIG_X86_32
-void cpu_exit_clear(void)
-{
-	int cpu = raw_smp_processor_id();
-
-	idle_task_exit();
-
-	cpu_uninit();
-	irq_ctx_exit(cpu);
-
-	cpu_clear(cpu, cpu_callout_map);
-	cpu_clear(cpu, cpu_callin_map);
-
-	numa_remove_cpu(cpu);
-}
-#  endif /* CONFIG_X86_32 */
-
 static void remove_siblinginfo(int cpu)
 {
 	int sibling;
diff --git a/include/asm-x86/numa_32.h b/include/asm-x86/numa_32.h
index 03d0f7a..205015f 100644
--- a/include/asm-x86/numa_32.h
+++ b/include/asm-x86/numa_32.h
@@ -2,6 +2,7 @@
 #define _ASM_X86_32_NUMA_H 1
 
 extern int pxm_to_nid(int pxm);
+extern void numa_remove_cpu(int cpu);
 
 #ifdef CONFIG_NUMA
 extern void __init remap_numa_kva(void);
diff --git a/include/asm-x86/smp.h b/include/asm-x86/smp.h
index fc10073..fad45f6 100644
--- a/include/asm-x86/smp.h
+++ b/include/asm-x86/smp.h
@@ -188,7 +188,6 @@ static inline int hard_smp_processor_id(void)
 #endif /* CONFIG_X86_LOCAL_APIC */
 
 #ifdef CONFIG_HOTPLUG_CPU
-extern void cpu_exit_clear(void);
 extern void cpu_uninit(void);
 #endif
 
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 15/15] x86: take load_sp0 out of smpboot.c
  2008-06-09 14:16                           ` [PATCH 14/15] x86: move cpu_exit_clear to process_32.c Glauber Costa
@ 2008-06-09 14:16                             ` Glauber Costa
  0 siblings, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 14:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, tglx, mingo, hugh

there's no particular reason to do load_sp0 in different
places for i386 and x86_64. They should all be in cpu_init.
Right now, cpu_init itself is not integrated, but with this patch,
the code becomes closer to each other, making in easier to integrate
when the time comes.

Furthermore, although doing it in do_boot_cpu for x86_64 is fine, since it's
only a copy, load_sp0 should be executed in the cpu it refers to anyway.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/setup64.c |    1 +
 arch/x86/kernel/smpboot.c |    1 -
 2 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/setup64.c b/arch/x86/kernel/setup64.c
index 70ff071..151d315 100644
--- a/arch/x86/kernel/setup64.c
+++ b/arch/x86/kernel/setup64.c
@@ -247,6 +247,7 @@ void __cpuinit cpu_init (void)
 		BUG();
 	enter_lazy_tlb(&init_mm, me);
 
+	load_sp0(t, &current->thread);
 	set_tss_desc(cpu, t);
 	load_TR_desc();
 	load_LDT(&init_mm.context);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 7952c39..371113a 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -864,7 +864,6 @@ do_rest:
 	irq_ctx_init(cpu);
 #else
 	cpu_pda(cpu)->pcurrent = c_idle.idle;
-	load_sp0(&per_cpu(init_tss, cpu), &c_idle.idle->thread);
 	clear_tsk_thread_flag(c_idle.idle, TIF_FORK);
 #endif
 	initial_code = (unsigned long)start_secondary;
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/15] x86: remove early_gdt_descr reference
  2008-06-09 14:16     ` [PATCH 03/15] x86: remove early_gdt_descr reference Glauber Costa
  2008-06-09 14:16       ` [PATCH 04/15] x86: move x86_64 gdt closer to i386 Glauber Costa
@ 2008-06-09 15:23       ` James Bottomley
  2008-06-09 15:49         ` Glauber Costa
  1 sibling, 1 reply; 38+ messages in thread
From: James Bottomley @ 2008-06-09 15:23 UTC (permalink / raw)
  To: Glauber Costa; +Cc: linux-kernel, akpm, tglx, mingo, hugh

On Mon, 2008-06-09 at 11:16 -0300, Glauber Costa wrote:
> since we use switch_to_new_gdt, there is no point
> in assigning early_gdt_descr except for the first
> assignment, which is done manually.

What makes you think you can do this?  If you don't update the early
boot gdt, they all end up using the Boot CPU one.  The problem with this
is that there's a time from start_secondary to switch_to_new_gdt where
the per cpu selector (%fs) and the pda selector (%gs) are those of the
boot CPU.   The former isn't a problem but the CPU number is in the
latter, and it's used in that path before we get to the initialisation.

James



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 14:16                     ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Glauber Costa
  2008-06-09 14:16                       ` [PATCH 12/15] x86: change naming to match x86_64 Glauber Costa
@ 2008-06-09 15:23                       ` Maciej W. Rozycki
  2008-06-09 15:52                         ` Glauber Costa
  2008-06-09 19:44                         ` Glauber Costa
  1 sibling, 2 replies; 38+ messages in thread
From: Maciej W. Rozycki @ 2008-06-09 15:23 UTC (permalink / raw)
  To: Glauber Costa; +Cc: linux-kernel, akpm, tglx, mingo, hugh

On Mon, 9 Jun 2008, Glauber Costa wrote:

> Do it, instead of keeping in io_apic_32.c. This is the way x86_64
> already does.

 This change looks wrong -- is native_smp_prepare_cpus() at all run on
!CONFIG_SMP?  I don't think so.  You have to do this differently.

  Maciej

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/15] x86: remove early_gdt_descr reference
  2008-06-09 15:23       ` [PATCH 03/15] x86: remove early_gdt_descr reference James Bottomley
@ 2008-06-09 15:49         ` Glauber Costa
  2008-06-09 17:20           ` James Bottomley
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 15:49 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel, akpm, tglx, mingo, hugh

James Bottomley wrote:
> On Mon, 2008-06-09 at 11:16 -0300, Glauber Costa wrote:
>> since we use switch_to_new_gdt, there is no point
>> in assigning early_gdt_descr except for the first
>> assignment, which is done manually.
> 
> What makes you think you can do this?  If you don't update the early
> boot gdt, they all end up using the Boot CPU one.  The problem with this
> is that there's a time from start_secondary to switch_to_new_gdt where
> the per cpu selector (%fs) and the pda selector (%gs) are those of the
> boot CPU.   The former isn't a problem but the CPU number is in the
> latter, and it's used in that path before we get to the initialisation.

You are right, I missed it.

However, it only seem to be used in cpu_init, and very early. Sure there 
are some users _before_ we load the new gdt, but nothing prevents them 
to be moved after it. (Of course, this patch is wrong anyway).

And if we do that, we can even take the %fs loading out of head_32.S
Of course, it's only valid if those are indeed the only early users of it.

Is there any other use I'm missing?


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 15:23                       ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Maciej W. Rozycki
@ 2008-06-09 15:52                         ` Glauber Costa
  2008-06-09 19:44                         ` Glauber Costa
  1 sibling, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 15:52 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-kernel, akpm, tglx, mingo, hugh

Maciej W. Rozycki wrote:
> On Mon, 9 Jun 2008, Glauber Costa wrote:
> 
>> Do it, instead of keeping in io_apic_32.c. This is the way x86_64
>> already does.
> 
>  This change looks wrong -- is native_smp_prepare_cpus() at all run on
> !CONFIG_SMP?  I don't think so.  You have to do this differently.
You're right.

I'll update this one.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/15] x86: remove early_gdt_descr reference
  2008-06-09 15:49         ` Glauber Costa
@ 2008-06-09 17:20           ` James Bottomley
  2008-06-09 17:23             ` Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: James Bottomley @ 2008-06-09 17:20 UTC (permalink / raw)
  To: Glauber Costa; +Cc: linux-kernel, akpm, tglx, mingo, hugh

On Mon, 2008-06-09 at 12:49 -0300, Glauber Costa wrote:
> James Bottomley wrote:
> > On Mon, 2008-06-09 at 11:16 -0300, Glauber Costa wrote:
> >> since we use switch_to_new_gdt, there is no point
> >> in assigning early_gdt_descr except for the first
> >> assignment, which is done manually.
> > 
> > What makes you think you can do this?  If you don't update the early
> > boot gdt, they all end up using the Boot CPU one.  The problem with this
> > is that there's a time from start_secondary to switch_to_new_gdt where
> > the per cpu selector (%fs) and the pda selector (%gs) are those of the
> > boot CPU.   The former isn't a problem but the CPU number is in the
> > latter, and it's used in that path before we get to the initialisation.
> 
> You are right, I missed it.
> 
> However, it only seem to be used in cpu_init, and very early. Sure there 
> are some users _before_ we load the new gdt, but nothing prevents them 
> to be moved after it. (Of course, this patch is wrong anyway).
> 
> And if we do that, we can even take the %fs loading out of head_32.S
> Of course, it's only valid if those are indeed the only early users of it.
> 
> Is there any other use I'm missing?

Well, %fs loading there is done for the boot CPU. To eliminate that you
have to not only verify that start_secondary doesn't use anything in
per_cpu areas, but also verify that nothing in start_kernel() up until
boot_cpu_init() does ... That's a lot of smp_processor_id() references
to convert.

James



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/15] x86: remove early_gdt_descr reference
  2008-06-09 17:20           ` James Bottomley
@ 2008-06-09 17:23             ` Glauber Costa
  2008-06-09 17:40               ` James Bottomley
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 17:23 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel, akpm, tglx, mingo, hugh

James Bottomley wrote:
> On Mon, 2008-06-09 at 12:49 -0300, Glauber Costa wrote:
>> James Bottomley wrote:
>>> On Mon, 2008-06-09 at 11:16 -0300, Glauber Costa wrote:
>>>> since we use switch_to_new_gdt, there is no point
>>>> in assigning early_gdt_descr except for the first
>>>> assignment, which is done manually.
>>> What makes you think you can do this?  If you don't update the early
>>> boot gdt, they all end up using the Boot CPU one.  The problem with this
>>> is that there's a time from start_secondary to switch_to_new_gdt where
>>> the per cpu selector (%fs) and the pda selector (%gs) are those of the
>>> boot CPU.   The former isn't a problem but the CPU number is in the
>>> latter, and it's used in that path before we get to the initialisation.
>> You are right, I missed it.
>>
>> However, it only seem to be used in cpu_init, and very early. Sure there 
>> are some users _before_ we load the new gdt, but nothing prevents them 
>> to be moved after it. (Of course, this patch is wrong anyway).
>>
>> And if we do that, we can even take the %fs loading out of head_32.S
>> Of course, it's only valid if those are indeed the only early users of it.
>>
>> Is there any other use I'm missing?
> 
> Well, %fs loading there is done for the boot CPU. To eliminate that you
> have to not only verify that start_secondary doesn't use anything in
> per_cpu areas, but also verify that nothing in start_kernel() up until
> boot_cpu_init() does ... That's a lot of smp_processor_id() references
> to convert.
Yes, after a second look, it would be tricky indeed. But only for cpu0.
For all the others, I still think we could get rid of the problem by 
switching to the new gdt earlier in cpu_init.

What do you think?


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/15] x86: remove early_gdt_descr reference
  2008-06-09 17:23             ` Glauber Costa
@ 2008-06-09 17:40               ` James Bottomley
  0 siblings, 0 replies; 38+ messages in thread
From: James Bottomley @ 2008-06-09 17:40 UTC (permalink / raw)
  To: Glauber Costa; +Cc: linux-kernel, akpm, tglx, mingo, hugh

On Mon, 2008-06-09 at 14:23 -0300, Glauber Costa wrote:
> James Bottomley wrote:
> > On Mon, 2008-06-09 at 12:49 -0300, Glauber Costa wrote:
> >> James Bottomley wrote:
> >>> On Mon, 2008-06-09 at 11:16 -0300, Glauber Costa wrote:
> >>>> since we use switch_to_new_gdt, there is no point
> >>>> in assigning early_gdt_descr except for the first
> >>>> assignment, which is done manually.
> >>> What makes you think you can do this?  If you don't update the early
> >>> boot gdt, they all end up using the Boot CPU one.  The problem with this
> >>> is that there's a time from start_secondary to switch_to_new_gdt where
> >>> the per cpu selector (%fs) and the pda selector (%gs) are those of the
> >>> boot CPU.   The former isn't a problem but the CPU number is in the
> >>> latter, and it's used in that path before we get to the initialisation.
> >> You are right, I missed it.
> >>
> >> However, it only seem to be used in cpu_init, and very early. Sure there 
> >> are some users _before_ we load the new gdt, but nothing prevents them 
> >> to be moved after it. (Of course, this patch is wrong anyway).
> >>
> >> And if we do that, we can even take the %fs loading out of head_32.S
> >> Of course, it's only valid if those are indeed the only early users of it.
> >>
> >> Is there any other use I'm missing?
> > 
> > Well, %fs loading there is done for the boot CPU. To eliminate that you
> > have to not only verify that start_secondary doesn't use anything in
> > per_cpu areas, but also verify that nothing in start_kernel() up until
> > boot_cpu_init() does ... That's a lot of smp_processor_id() references
> > to convert.
> Yes, after a second look, it would be tricky indeed. But only for cpu0.
> For all the others, I still think we could get rid of the problem by 
> switching to the new gdt earlier in cpu_init.
> 
> What do you think?

Operating a CPU with a bogus GDT is very fragile.  You can fix all the
current issues with the secondary CPUs, but it gives a critical section
within which none of the per_cpu operations will work.  It only takes
one patch violating this rule and we have a very subtle bug introduced.

It looks to me like the better fix might be just to initialise the gdt
completely and properly in do_boot_cpu and just have the single switch
in head_32.S be the correct one.  That way there's no problem critical
region.

James



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 15:23                       ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Maciej W. Rozycki
  2008-06-09 15:52                         ` Glauber Costa
@ 2008-06-09 19:44                         ` Glauber Costa
  2008-06-09 20:12                           ` Maciej W. Rozycki
  1 sibling, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 19:44 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-kernel, akpm, tglx, mingo, hugh

[-- Attachment #1: Type: text/plain, Size: 443 bytes --]

Maciej W. Rozycki wrote:
> On Mon, 9 Jun 2008, Glauber Costa wrote:
> 
>> Do it, instead of keeping in io_apic_32.c. This is the way x86_64
>> already does.
> 
>  This change looks wrong -- is native_smp_prepare_cpus() at all run on
> !CONFIG_SMP?  I don't think so.  You have to do this differently.
> 
>   Maciej

How about this one instead?

This one does enable_IO_APIC in the init_uniprocessor too, and should 
account for the !smp case.

[-- Attachment #2: 0001-x86-move-enabling-of-io_apic-to-prepare_cpus.patch --]
[-- Type: text/x-patch, Size: 2526 bytes --]

>From 87f8366012e1dc3cf7212e241b0aa552d7825338 Mon Sep 17 00:00:00 2001
From: Glauber Costa <gcosta@redhat.com>
Date: Wed, 28 May 2008 17:04:12 -0300
Subject: [PATCH] x86: move enabling of io_apic to prepare_cpus

Do it, instead of keeping in io_apic_32.c. This is the way x86_64 already does.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
---
 arch/x86/kernel/apic_32.c    |    3 +++
 arch/x86/kernel/io_apic_32.c |    4 ++--
 arch/x86/kernel/smpboot.c    |    2 +-
 include/asm-x86/hw_irq.h     |    2 --
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/apic_32.c b/arch/x86/kernel/apic_32.c
index fa8cf79..8385dc6 100644
--- a/arch/x86/kernel/apic_32.c
+++ b/arch/x86/kernel/apic_32.c
@@ -1273,6 +1273,9 @@ int __init APIC_init_uniprocessor(void)
 	setup_local_APIC();
 
 #ifdef CONFIG_X86_IO_APIC
+	if (!skip_ioapic_setup && nr_ioapics)
+		enable_IO_APIC();
+
 	if (!smp_found_config || skip_ioapic_setup || !nr_ioapics)
 #endif
 		localise_nmi_watchdog();
diff --git a/arch/x86/kernel/io_apic_32.c b/arch/x86/kernel/io_apic_32.c
index 7fc071f..fa4f0a0 100644
--- a/arch/x86/kernel/io_apic_32.c
+++ b/arch/x86/kernel/io_apic_32.c
@@ -1606,7 +1606,7 @@ void /*__init*/ print_PIC(void)
 
 #endif  /*  0  */
 
-static void __init enable_IO_APIC(void)
+void __init enable_IO_APIC(void)
 {
 	union IO_APIC_reg_01 reg_01;
 	int i8259_apic, i8259_pin;
@@ -2295,7 +2295,7 @@ void __init setup_IO_APIC(void)
 	for (i = first_system_vector; i < NR_VECTORS; i++)
 		set_bit(i, used_vectors);
 
-	enable_IO_APIC();
+	/* calling enable_IO_APIC() is moved to setup_local_APIC for BP */
 
 	if (acpi_ioapic)
 		io_apic_irqs = ~0;	/* all IRQs go through IOAPIC */
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f5ac5b2..e537c7b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1175,7 +1175,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	 */
 	setup_local_APIC();
 
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_X86_IO_APIC
 	/*
 	 * Enable IO APIC before setting up error vector
 	 */
diff --git a/include/asm-x86/hw_irq.h b/include/asm-x86/hw_irq.h
index 18f067c..4c81d1b 100644
--- a/include/asm-x86/hw_irq.h
+++ b/include/asm-x86/hw_irq.h
@@ -67,9 +67,7 @@ extern void print_IO_APIC(void);
 extern int IO_APIC_get_PCI_irq_vector(int bus, int slot, int fn);
 extern void setup_ioapic_dest(void);
 
-#ifdef CONFIG_X86_64
 extern void enable_IO_APIC(void);
-#endif
 
 /* IPI functions */
 extern void send_IPI_self(int vector);
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 19:44                         ` Glauber Costa
@ 2008-06-09 20:12                           ` Maciej W. Rozycki
  2008-06-09 20:53                             ` Yinghai Lu
  2008-06-09 21:02                             ` Glauber Costa
  0 siblings, 2 replies; 38+ messages in thread
From: Maciej W. Rozycki @ 2008-06-09 20:12 UTC (permalink / raw)
  To: Glauber Costa; +Cc: linux-kernel, akpm, tglx, mingo, hugh

On Mon, 9 Jun 2008, Glauber Costa wrote:

> This one does enable_IO_APIC in the init_uniprocessor too, and should 
> account for the !smp case.

 Hmm, it looks a little bit better, but why do you want to call
enable_IO_APIC() separately in the first place?  There is a comment
stating: "Enable IO APIC before setting up error vector," but why is it
needed on 64-bit systems?  Especially as the very same system may run a
32-bit kernel and then it suddenly would not have to do this anymore?  
Strange...

 Also since you are cleaning up this code -- why don't you actually take
the opportunity and get rid of the horrible #ifdefs interspersed
throughout?

  Maciej

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 20:12                           ` Maciej W. Rozycki
@ 2008-06-09 20:53                             ` Yinghai Lu
  2008-06-09 21:00                               ` Maciej W. Rozycki
  2008-06-09 21:02                             ` Glauber Costa
  1 sibling, 1 reply; 38+ messages in thread
From: Yinghai Lu @ 2008-06-09 20:53 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Mon, Jun 9, 2008 at 1:12 PM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> On Mon, 9 Jun 2008, Glauber Costa wrote:
>
>> This one does enable_IO_APIC in the init_uniprocessor too, and should
>> account for the !smp case.
>
>  Hmm, it looks a little bit better, but why do you want to call
> enable_IO_APIC() separately in the first place?  There is a comment
> stating: "Enable IO APIC before setting up error vector," but why is it
> needed on 64-bit systems?  Especially as the very same system may run a
> 32-bit kernel and then it suddenly would not have to do this anymore?
> Strange...

that enable_IO_APIC() only call record old ioapic/pin for timer and
clear_IO_APIC.

we need clear_IO_APIC before enable setup error vector, in case there
is wrong setting in ioapic by BIOS...

YH

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 20:53                             ` Yinghai Lu
@ 2008-06-09 21:00                               ` Maciej W. Rozycki
  2008-06-10  2:46                                 ` Maciej W. Rozycki
  0 siblings, 1 reply; 38+ messages in thread
From: Maciej W. Rozycki @ 2008-06-09 21:00 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Mon, 9 Jun 2008, Yinghai Lu wrote:

> we need clear_IO_APIC before enable setup error vector, in case there
> is wrong setting in ioapic by BIOS...

 Fair enough, although it is still interesting why it would only trigger 
in the 64-bit mode and why shouldn't the BIOS be fixed instead.

  Maciej

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 20:12                           ` Maciej W. Rozycki
  2008-06-09 20:53                             ` Yinghai Lu
@ 2008-06-09 21:02                             ` Glauber Costa
  1 sibling, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2008-06-09 21:02 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-kernel, akpm, tglx, mingo, hugh

Maciej W. Rozycki wrote:
> On Mon, 9 Jun 2008, Glauber Costa wrote:
> 
>> This one does enable_IO_APIC in the init_uniprocessor too, and should 
>> account for the !smp case.
> 
>  Hmm, it looks a little bit better, but why do you want to call
> enable_IO_APIC() separately in the first place?  There is a comment
> stating: "Enable IO APIC before setting up error vector," but why is it
> needed on 64-bit systems?  Especially as the very same system may run a
> 32-bit kernel and then it suddenly would not have to do this anymore?  
> Strange...

This was reported by Yinghai, but I think he already answered to that.

>  Also since you are cleaning up this code -- why don't you actually take
> the opportunity and get rid of the horrible #ifdefs interspersed
> throughout?
throughout where?
They're all over the place ;)

My next target would be per-cpu data. But that's because there's _a lot_ 
of code in the tree that got ifdefs between 32 and 64-bit because of 
differences in that, specially irq statistics. A macro would do, but if 
we're gonna do it, let's do it right.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-09 21:00                               ` Maciej W. Rozycki
@ 2008-06-10  2:46                                 ` Maciej W. Rozycki
  2008-06-10  5:08                                   ` Yinghai Lu
  0 siblings, 1 reply; 38+ messages in thread
From: Maciej W. Rozycki @ 2008-06-10  2:46 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Mon, 9 Jun 2008, Maciej W. Rozycki wrote:

> > we need clear_IO_APIC before enable setup error vector, in case there
> > is wrong setting in ioapic by BIOS...
> 
>  Fair enough, although it is still interesting why it would only trigger 
> in the 64-bit mode and why shouldn't the BIOS be fixed instead.

 Then again -- what if X86_LOCAL_APIC is set, but X86_IO_APIC is not?

  Maciej

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-10  2:46                                 ` Maciej W. Rozycki
@ 2008-06-10  5:08                                   ` Yinghai Lu
  2008-06-10 13:00                                     ` Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Yinghai Lu @ 2008-06-10  5:08 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Mon, Jun 9, 2008 at 7:46 PM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> On Mon, 9 Jun 2008, Maciej W. Rozycki wrote:
>
>> > we need clear_IO_APIC before enable setup error vector, in case there
>> > is wrong setting in ioapic by BIOS...
>>
>>  Fair enough, although it is still interesting why it would only trigger
>> in the 64-bit mode and why shouldn't the BIOS be fixed instead.
>
>  Then again -- what if X86_LOCAL_APIC is set, but X86_IO_APIC is not?

config X86_LOCAL_APIC
        def_bool y
        depends on X86_64 || (X86_32 && (X86_UP_APIC || ((X86_VISWS ||
SMP) && !X86_VOYAGER) || X86_GENERICARCH))

config X86_IO_APIC
        def_bool y
        depends on X86_64 || (X86_32 && (X86_UP_IOAPIC || (SMP &&
!(X86_VISWS || X86_VOYAGER)) || X86_GENERICARCH))

for 64bit, those are all set.

for 32bit, may need to null stub if X86_IO_APIC is not set

YH

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-10  5:08                                   ` Yinghai Lu
@ 2008-06-10 13:00                                     ` Glauber Costa
  2008-06-10 13:30                                       ` Maciej W. Rozycki
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2008-06-10 13:00 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Maciej W. Rozycki, linux-kernel, akpm, tglx, mingo, hugh

Yinghai Lu wrote:
> On Mon, Jun 9, 2008 at 7:46 PM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
>> On Mon, 9 Jun 2008, Maciej W. Rozycki wrote:
>>
>>>> we need clear_IO_APIC before enable setup error vector, in case there
>>>> is wrong setting in ioapic by BIOS...
>>>  Fair enough, although it is still interesting why it would only trigger
>>> in the 64-bit mode and why shouldn't the BIOS be fixed instead.
>>  Then again -- what if X86_LOCAL_APIC is set, but X86_IO_APIC is not?
> 
> config X86_LOCAL_APIC
>         def_bool y
>         depends on X86_64 || (X86_32 && (X86_UP_APIC || ((X86_VISWS ||
> SMP) && !X86_VOYAGER) || X86_GENERICARCH))
> 
> config X86_IO_APIC
>         def_bool y
>         depends on X86_64 || (X86_32 && (X86_UP_IOAPIC || (SMP &&
> !(X86_VISWS || X86_VOYAGER)) || X86_GENERICARCH))
> 
> for 64bit, those are all set.
> 
> for 32bit, may need to null stub if X86_IO_APIC is not set
> 
> YH
Fair Enough.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-10 13:00                                     ` Glauber Costa
@ 2008-06-10 13:30                                       ` Maciej W. Rozycki
  2008-06-10 19:09                                         ` Yinghai Lu
  0 siblings, 1 reply; 38+ messages in thread
From: Maciej W. Rozycki @ 2008-06-10 13:30 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Yinghai Lu, linux-kernel, akpm, tglx, mingo, hugh

On Tue, 10 Jun 2008, Glauber Costa wrote:

> >>  Then again -- what if X86_LOCAL_APIC is set, but X86_IO_APIC is not?
> > 
> > config X86_LOCAL_APIC
> >         def_bool y
> >         depends on X86_64 || (X86_32 && (X86_UP_APIC || ((X86_VISWS ||
> > SMP) && !X86_VOYAGER) || X86_GENERICARCH))
> > 
> > config X86_IO_APIC
> >         def_bool y
> >         depends on X86_64 || (X86_32 && (X86_UP_IOAPIC || (SMP &&
> > !(X86_VISWS || X86_VOYAGER)) || X86_GENERICARCH))
> > 
> > for 64bit, those are all set.
> > 
> > for 32bit, may need to null stub if X86_IO_APIC is not set
> > 
> > YH
> Fair Enough.

 That does not answer the question what to do with a 32-bit kernel run on
a system that requires the fixup in the 64-bit mode.  Papering over such
firmware bugs in the kernel encourages hardware vendors to maintain the
breakage.

 Note that the I/O APIC comes out of reset with all the redirection
entries masked out, so if any error conditions are triggered before Linux
configures the chip, they are a result of a deliberate misprogramming done
to the chip by the firmware.

 Does anyone have a reference to the original problem report which lead to
this workaround having been put in place?

  Maciej

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-10 13:30                                       ` Maciej W. Rozycki
@ 2008-06-10 19:09                                         ` Yinghai Lu
  2008-06-10 19:36                                           ` Maciej W. Rozycki
  0 siblings, 1 reply; 38+ messages in thread
From: Yinghai Lu @ 2008-06-10 19:09 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Tue, Jun 10, 2008 at 6:30 AM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> On Tue, 10 Jun 2008, Glauber Costa wrote:
>
>> >>  Then again -- what if X86_LOCAL_APIC is set, but X86_IO_APIC is not?
>> >
>> > config X86_LOCAL_APIC
>> >         def_bool y
>> >         depends on X86_64 || (X86_32 && (X86_UP_APIC || ((X86_VISWS ||
>> > SMP) && !X86_VOYAGER) || X86_GENERICARCH))
>> >
>> > config X86_IO_APIC
>> >         def_bool y
>> >         depends on X86_64 || (X86_32 && (X86_UP_IOAPIC || (SMP &&
>> > !(X86_VISWS || X86_VOYAGER)) || X86_GENERICARCH))
>> >
>> > for 64bit, those are all set.
>> >
>> > for 32bit, may need to null stub if X86_IO_APIC is not set
>> >
>> > YH
>> Fair Enough.
>
>  That does not answer the question what to do with a 32-bit kernel run on
> a system that requires the fixup in the 64-bit mode.  Papering over such
> firmware bugs in the kernel encourages hardware vendors to maintain the
> breakage.

kernel should not assume io apic register is set right by firmware.
and kernel actually doesn't trust them, and clear the io apic registers.
that patch just move that early before enable error vector.

YH

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-10 19:09                                         ` Yinghai Lu
@ 2008-06-10 19:36                                           ` Maciej W. Rozycki
  2008-06-10 19:49                                             ` Yinghai Lu
  0 siblings, 1 reply; 38+ messages in thread
From: Maciej W. Rozycki @ 2008-06-10 19:36 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Tue, 10 Jun 2008, Yinghai Lu wrote:

> kernel should not assume io apic register is set right by firmware.

 What is the basis of this assumption?  Linux generally assumes the
chipset components have been placed by the firmware into a consistent
state.  That does not necessarily mean suitable for Linux, hence the need
to reconfigure a bit here or there, but there should be no need to touch
components that are not going to be used by Linux directly.  The I/O APIC
is no different.

> and kernel actually doesn't trust them, and clear the io apic registers.

 The I/O APIC registers are cleared, because this is the only way you can
assure inconsistent configuration does not happen during reconfiguration.  
For example two inputs using the same vector or set up into the ExtINTA
mode.  The original intent of the code was not to paper over breakage in
the firmware.  You are trying to change it and it can be done, but it has
to be justified well.

> that patch just move that early before enable error vector.

 What I am saying repeatedly is clearing of the I/O APIC is not guaranteed
to happen for all the possible cases of Linux configuration.  Which means
this is a partial solution only -- please try to propose a better one or
provide the original problem report so that someone else can have a look
at it.  What's triggering the error interrupt for example?  Is it
recoverable?

  Maciej

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-10 19:36                                           ` Maciej W. Rozycki
@ 2008-06-10 19:49                                             ` Yinghai Lu
  2008-06-11  0:29                                               ` Maciej W. Rozycki
  0 siblings, 1 reply; 38+ messages in thread
From: Yinghai Lu @ 2008-06-10 19:49 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Tue, Jun 10, 2008 at 12:36 PM, Maciej W. Rozycki
<macro@linux-mips.org> wrote:
> On Tue, 10 Jun 2008, Yinghai Lu wrote:
>
>> kernel should not assume io apic register is set right by firmware.
>
>  What is the basis of this assumption?  Linux generally assumes the
> chipset components have been placed by the firmware into a consistent
> state.  That does not necessarily mean suitable for Linux, hence the need
> to reconfigure a bit here or there, but there should be no need to touch
> components that are not going to be used by Linux directly.  The I/O APIC
> is no different.
>
>> and kernel actually doesn't trust them, and clear the io apic registers.
>
>  The I/O APIC registers are cleared, because this is the only way you can
> assure inconsistent configuration does not happen during reconfiguration.
> For example two inputs using the same vector or set up into the ExtINTA
> mode.  The original intent of the code was not to paper over breakage in
> the firmware.  You are trying to change it and it can be done, but it has
> to be justified well.
>
>> that patch just move that early before enable error vector.
>
>  What I am saying repeatedly is clearing of the I/O APIC is not guaranteed
> to happen for all the possible cases of Linux configuration.  Which means
> this is a partial solution only -- please try to propose a better one or
> provide the original problem report so that someone else can have a look
> at it.  What's triggering the error interrupt for example?  Is it
> recoverable?

ExtINT is routed to ioapic pin0. but the dst is set to 0.
and the systems has multi sockets with quadcore cpu, so the apic id of boot cpu
is set to 4 instead of 0

YH

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-10 19:49                                             ` Yinghai Lu
@ 2008-06-11  0:29                                               ` Maciej W. Rozycki
  2008-06-11  2:32                                                 ` Yinghai Lu
  0 siblings, 1 reply; 38+ messages in thread
From: Maciej W. Rozycki @ 2008-06-11  0:29 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Tue, 10 Jun 2008, Yinghai Lu wrote:

> ExtINT is routed to ioapic pin0. but the dst is set to 0.
> and the systems has multi sockets with quadcore cpu, so the apic id of boot cpu
> is set to 4 instead of 0

 Thanks for the info.  Let me understand the situation better: local APIC
IDs are preassigned by the firmware based on their "socket address" and
the socket where the lowest numbered quad would be is empty.  
Nevertheless the firmware sets the destination ID of the ExtINTA interrupt
in the I/O APIC to 0 rather than the ID of the bootstrap CPU.  Is that
correct?

 But it would mean the Virtual Wire interrupt delivery would not work, or
is the I/O APIC setup redundant and the local APIC of the bootstrap CPU is
set up for ExtINTA delivery as well?

  Maciej

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-11  0:29                                               ` Maciej W. Rozycki
@ 2008-06-11  2:32                                                 ` Yinghai Lu
  2008-06-11 12:57                                                   ` Maciej W. Rozycki
  0 siblings, 1 reply; 38+ messages in thread
From: Yinghai Lu @ 2008-06-11  2:32 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Tue, Jun 10, 2008 at 5:29 PM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> On Tue, 10 Jun 2008, Yinghai Lu wrote:
>
>> ExtINT is routed to ioapic pin0. but the dst is set to 0.
>> and the systems has multi sockets with quadcore cpu, so the apic id of boot cpu
>> is set to 4 instead of 0
>
>  Thanks for the info.  Let me understand the situation better: local APIC
> IDs are preassigned by the firmware based on their "socket address" and
> the socket where the lowest numbered quad would be is empty.
> Nevertheless the firmware sets the destination ID of the ExtINTA interrupt
> in the I/O APIC to 0 rather than the ID of the bootstrap CPU.  Is that
> correct?

Yes

after I asked bios engineer to change the dest apic id to 4, the error is gone.

before clear_IO_APIC()
number of MP IRQ sources: 15.
number of IO-APIC #0 registers: 24.
number of IO-APIC #1 registers: 7.
number of IO-APIC #2 registers: 7.
number of IO-APIC #3 registers: 24.
testing the IO APIC.......................

IO APIC #0......
.... register #00: 00000000
.......    : physical APIC id: 00
.... register #01: 00170011
.......     : max redirection entries: 0017
.......     : PRQ implemented: 0
.......     : IO APIC version: 0011
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
 NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
 00 004 0    0    0   0   0    0    7    00
 01 000 1    0    0   0   0    0    0    00
 02 000 1    0    0   0   0    0    0    00
 03 000 1    0    0   0   0    0    0    00
 04 000 1    0    0   0   0    0    0    00
 05 000 1    0    0   0   0    0    0    00
 06 000 1    0    0   0   0    0    0    00
 07 000 1    0    0   0   0    0    0    00
 08 000 1    0    0   0   0    0    0    00
 09 000 1    0    0   0   0    0    0    00
 0a 000 1    0    0   0   0    0    0    00
 0b 000 1    0    0   0   0    0    0    00
 0c 000 1    0    0   0   0    0    0    00
 0d 000 1    0    0   0   0    0    0    00
 0e 000 1    0    0   0   0    0    0    00
 0f 000 1    0    0   0   0    0    0    00
 10 000 1    0    0   0   0    0    0    00
 11 000 1    0    0   0   0    0    0    00
 12 000 1    0    0   0   0    0    0    00
 13 000 1    0    0   0   0    0    0    00
 14 000 1    0    0   0   0    0    0    00
 15 000 1    0    0   0   0    0    0    00
 16 000 1    0    0   0   0    0    0    00
 17 000 1    0    0   0   0    0    0    00


>
>  But it would mean the Virtual Wire interrupt delivery would not work, or
> is the I/O APIC setup redundant and the local APIC of the bootstrap CPU is
> set up for ExtINTA delivery as well?

it doesn't need to virtual wire for timer (ExtInt), timer is hpet and
routed to ioapic pin2.

Just know at first BIOS engineer refused to change that to 4, because
other os is not happy.

YH

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus
  2008-06-11  2:32                                                 ` Yinghai Lu
@ 2008-06-11 12:57                                                   ` Maciej W. Rozycki
  0 siblings, 0 replies; 38+ messages in thread
From: Maciej W. Rozycki @ 2008-06-11 12:57 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Glauber Costa, linux-kernel, akpm, tglx, mingo, hugh

On Tue, 10 Jun 2008, Yinghai Lu wrote:

> >  Thanks for the info.  Let me understand the situation better: local APIC
> > IDs are preassigned by the firmware based on their "socket address" and
> > the socket where the lowest numbered quad would be is empty.
> > Nevertheless the firmware sets the destination ID of the ExtINTA interrupt
> > in the I/O APIC to 0 rather than the ID of the bootstrap CPU.  Is that
> > correct?
> 
> Yes
> 
> after I asked bios engineer to change the dest apic id to 4, the error
> is gone.

 Thanks for the clarification.

> >  But it would mean the Virtual Wire interrupt delivery would not work, or
> > is the I/O APIC setup redundant and the local APIC of the bootstrap CPU is
> > set up for ExtINTA delivery as well?
> 
> it doesn't need to virtual wire for timer (ExtInt), timer is hpet and
> routed to ioapic pin2.

 That's not what I asked about -- the timer does not matter here.  The 
Virtual Wire mode is a configuration, where one input of one APIC in the 
system is set up for the ExtINTA mode and acts transparently with the 
system software having no need to know about it.  Instead a pair of 
legacy 8259A chips is used to deliver interrupts, including claiming the 
INTA cycles, providing vectors and prioritising sources, as defined in the 
PC/AT architecture.

 Many pieces of software rely on the 8259A PICs, either because they 
predate the APIC or because they have no means to make use of 
multiprocessor features anyway.  They include various versions of DOS 
together with software run in that environment (as DOS programs quite 
frequently drive hardware at the low level), many versions of the 
Microsoft Windows system as well as other software.  For these a legacy 
mode, either the Virtual Wire mode, or a mode where 8259A interrupts are 
delivered directly to one processor's INT line has to be implemented as 
mandated both by the Multiprocessor Specification and the Advanced 
Configuration and Power Interface Specification.

 Coming back to my question -- how is such a mode implemented in the 
affected system?  Clearly the route through the I/O APIC is not going to 
work and moreover, the chip clutters the system with broken interrupt 
messages each time the 8259A signals an interrupt.

 Please note Linux can use the Virtual Wire mode in the APIC/SMP mode too,
if requested by specifying the "noapic" command-line option -- have you 
verified the option works correctly with the affected system?

> Just know at first BIOS engineer refused to change that to 4, because
> other os is not happy.

 Well, this is just a confirmation my attitude is correct -- such problems 
should not be papered over, because vendors will deny their existence 
then.  At least a complaint message should be printed so that users have 
an opportunity to see it and ask their hardware supplier for an 
explanation.

 In this case, a workaround for the 64-bit mode happens to be quite cheap,
but it should be extended to cover the 32-bit mode as well.  The only
solution for the 32-bit mode I have in mind would lead to a waste of
resources for many users of good hardware.  And this because of somebody's
sloppiness -- as I have written -- this better be well justified.

 Unless you have precise means to identify this system -- in that case I
think reconfiguring the bootstrap processor's local APIC ID to 0 would be
the right approach.  Have you tried it?

  Maciej

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2008-06-11 13:00 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-09 14:16 [PATCH 0/15] Improve x86 smpboot integration Glauber Costa
2008-06-09 14:16 ` [PATCH 01/15] x86: use stack_start in x86_64 Glauber Costa
2008-06-09 14:16   ` [PATCH 02/15] x86: don't use gdt_page openly Glauber Costa
2008-06-09 14:16     ` [PATCH 03/15] x86: remove early_gdt_descr reference Glauber Costa
2008-06-09 14:16       ` [PATCH 04/15] x86: move x86_64 gdt closer to i386 Glauber Costa
2008-06-09 14:16         ` [PATCH 05/15] x86: use initial_code for i386 Glauber Costa
2008-06-09 14:16           ` [PATCH 06/15] x86: boot secondary cpus through initial_code Glauber Costa
2008-06-09 14:16             ` [PATCH 07/15] x86: clearing io_apic harmless for x86_64 Glauber Costa
2008-06-09 14:16               ` [PATCH 08/15] x86: remove ifdef from stepping Glauber Costa
2008-06-09 14:16                 ` [PATCH 09/15] x86: change __setup_vector_irq with setup_vector_irq Glauber Costa
2008-06-09 14:16                   ` [PATCH 10/15] x86: provide connect_bsp_APIC for x86_64 Glauber Costa
2008-06-09 14:16                     ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Glauber Costa
2008-06-09 14:16                       ` [PATCH 12/15] x86: change naming to match x86_64 Glauber Costa
2008-06-09 14:16                         ` [PATCH 13/15] x86: remove cpu from maps Glauber Costa
2008-06-09 14:16                           ` [PATCH 14/15] x86: move cpu_exit_clear to process_32.c Glauber Costa
2008-06-09 14:16                             ` [PATCH 15/15] x86: take load_sp0 out of smpboot.c Glauber Costa
2008-06-09 15:23                       ` [PATCH 11/15] x86: move enabling of io_apic to prepare_cpus Maciej W. Rozycki
2008-06-09 15:52                         ` Glauber Costa
2008-06-09 19:44                         ` Glauber Costa
2008-06-09 20:12                           ` Maciej W. Rozycki
2008-06-09 20:53                             ` Yinghai Lu
2008-06-09 21:00                               ` Maciej W. Rozycki
2008-06-10  2:46                                 ` Maciej W. Rozycki
2008-06-10  5:08                                   ` Yinghai Lu
2008-06-10 13:00                                     ` Glauber Costa
2008-06-10 13:30                                       ` Maciej W. Rozycki
2008-06-10 19:09                                         ` Yinghai Lu
2008-06-10 19:36                                           ` Maciej W. Rozycki
2008-06-10 19:49                                             ` Yinghai Lu
2008-06-11  0:29                                               ` Maciej W. Rozycki
2008-06-11  2:32                                                 ` Yinghai Lu
2008-06-11 12:57                                                   ` Maciej W. Rozycki
2008-06-09 21:02                             ` Glauber Costa
2008-06-09 15:23       ` [PATCH 03/15] x86: remove early_gdt_descr reference James Bottomley
2008-06-09 15:49         ` Glauber Costa
2008-06-09 17:20           ` James Bottomley
2008-06-09 17:23             ` Glauber Costa
2008-06-09 17:40               ` James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox