public inbox for linux-kernel@vger.kernel.org
* i386 cpu hotplug bug - instant reboot when onlining secondary
@ 2006-02-19 23:58 Nathan Lynch
  2006-02-21 16:20 ` Zwane Mwaikambo
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Lynch @ 2006-02-19 23:58 UTC
  To: linux-kernel; +Cc: Zwane Mwaikambo

Hi-

On a dual P3 Xeon machine, offlining and then onlining a cpu makes the
box instantly reboot.  I've been seeing this throughout the 2.6.16-rc
series, but wasn't able to collect more information until now.  Not
sure when this last worked, unfortunately.
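
Note for readers reproducing this: on kernels built with CONFIG_HOTPLUG_CPU,
a CPU is normally taken offline and brought back by writing 0 and then 1 to
its sysfs "online" file. The following is a minimal C sketch of that step,
not something posted in the thread. The helper names (cpu_online_path,
write_flag, set_cpu_online) are hypothetical, and the use of cpu1 is assumed
from the "CPU 1" lines in the log below.

```c
#include <stdio.h>
#include <string.h>

/* Build the sysfs path of a CPU's online control file.
 * Returns 0 on success, -1 if the buffer is too small. */
static int cpu_online_path(char *buf, size_t len, int cpu)
{
    int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%d/online", cpu);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write "0\n" or "1\n" to the given file. Returns 0 on success. */
static int write_flag(const char *path, int value)
{
    const char *s = value ? "1\n" : "0\n";
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    size_t n = fwrite(s, 1, strlen(s), f);
    int rc = (n == strlen(s)) ? 0 : -1;
    if (fclose(f) != 0)          /* buffered write errors surface here */
        rc = -1;
    return rc;
}

/* Hypothetical convenience wrapper: offline (0) or online (1) one CPU. */
static int set_cpu_online(int cpu, int online)
{
    char path[64];
    if (cpu_online_path(path, sizeof path, cpu) != 0)
        return -1;
    return write_flag(path, online);
}
```

Run as root, calling set_cpu_online(1, 0) followed by set_cpu_online(1, 1)
would correspond to the offline/online sequence described above.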

With the debugging patch below, I get this on serial console:

[17179681.704000] CPU 1 is now offline
[17179686.908000] Booting processor 1/1 eip 3000
[17179686.912000] CPU 1 irqstacks, hard=78383000 soft=7837b000
[17179686.920000] Setting warm reset code and vector.
[17179686.924000] 1.
[17179686.924000] 2.
[17179686.928000] 3.
[17179686.928000] Asserting INIT.
[17179686.932000] Waiting for send to finish...
[17179686.936000] +<7>Deasserting INIT.
[17179686.952000] Waiting for send to finish...
[17179686.956000] +<7>#startup loops: 2.
[17179686.960000] Sending STARTUP #1.
[17179686.960000] After apic_write.
[17179686.964000] Doing apic_write_around for target chip...
[17179686.972000] Doing apic_write_around to kick the second...

Any suggestions?


diff --git a/arch/i386/kernel/smpboot.c b/arch/i386/kernel/smpboot.c
index fb00ab7..85aff00 100644
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -801,10 +801,12 @@ wakeup_secondary_cpu(int phys_apicid, un
 		 */
 
 		/* Target chip */
+		Dprintk("Doing apic_write_around for target chip...\n");
 		apic_write_around(APIC_ICR2, SET_APIC_DEST_FIELD(phys_apicid));
 
 		/* Boot on the stack */
 		/* Kick the second */
+		Dprintk("Doing apic_write_around to kick the second...\n");
 		apic_write_around(APIC_ICR, APIC_DM_STARTUP
 					| (start_eip >> 12));
 
diff --git a/include/asm-i386/apic.h b/include/asm-i386/apic.h
index d30b857..2c8dcfa 100644
--- a/include/asm-i386/apic.h
+++ b/include/asm-i386/apic.h
@@ -8,7 +8,7 @@
 #include <asm/processor.h>
 #include <asm/system.h>
 
-#define Dprintk(x...)
+#define Dprintk(fmt,arg...) printk(KERN_DEBUG fmt,##arg)
 
 /*
  * Debugging macros
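
Note on the two apic_write_around() calls instrumented above: the first
selects the target CPU through APIC_ICR2, the second sends the STARTUP IPI,
whose low bits carry the 4 KiB page number of the trampoline
(start_eip >> 12). A small worked example in C, under two assumptions that
are not stated in the thread: that "eip 3000" in the log is hexadecimal, and
that APIC_DM_STARTUP and SET_APIC_DEST_FIELD have the values shown (as I
recall them from the i386 APIC definitions). The function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; check the kernel's APIC definitions before relying on them. */
#define APIC_DM_STARTUP        0x00600u
#define SET_APIC_DEST_FIELD(x) ((uint32_t)(x) << 24)

/* Value for APIC_ICR2: destination APIC ID in bits 31..24. */
static uint32_t sipi_icr2(unsigned phys_apicid)
{
    assert(phys_apicid <= 0xffu);
    return SET_APIC_DEST_FIELD(phys_apicid);
}

/* Value for APIC_ICR: Start-Up delivery mode plus the trampoline page number. */
static uint32_t sipi_icr(uint32_t start_eip)
{
    assert((start_eip & 0xfffu) == 0);  /* trampoline must be page aligned */
    assert(start_eip < 0x100000u);      /* page number must fit in 8 bits  */
    return APIC_DM_STARTUP | (start_eip >> 12);
}
```

Under those assumptions, sipi_icr(0x3000) yields 0x603 and sipi_icr2(1)
yields 0x01000000 for the CPU in this report.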


* Re: i386 cpu hotplug bug - instant reboot when onlining secondary
  2006-02-19 23:58 i386 cpu hotplug bug - instant reboot when onlining secondary Nathan Lynch
@ 2006-02-21 16:20 ` Zwane Mwaikambo
  2006-02-27  7:50   ` Nathan Lynch
  0 siblings, 1 reply; 9+ messages in thread
From: Zwane Mwaikambo @ 2006-02-21 16:20 UTC
  To: Nathan Lynch; +Cc: linux-kernel

On Sun, 19 Feb 2006, Nathan Lynch wrote:

> On a dual P3 Xeon machine, offlining and then onlining a cpu makes the
> box instantly reboot.  I've been seeing this throughout the 2.6.16-rc
> series, but wasn't able to collect more information until now.  Not
> sure when this last worked, unfortunately.
> 
> With the debugging patch below, I get this on serial console:

Does 2.6.14 work? Also, I wonder whether it gets out of the trampoline...

Index: linux-2.6.16-rc2/arch/i386/kernel/smpboot.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.16-rc2/arch/i386/kernel/smpboot.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 smpboot.c
--- linux-2.6.16-rc2/arch/i386/kernel/smpboot.c	11 Feb 2006 18:55:06 -0000	1.1.1.1
+++ linux-2.6.16-rc2/arch/i386/kernel/smpboot.c	21 Feb 2006 16:19:22 -0000
@@ -514,6 +514,7 @@ static void __devinit start_secondary(vo
 	cpu_init();
 	preempt_disable();
 	smp_callin();
+	Dprintk("startup_secondary\n");
 	while (!cpu_isset(smp_processor_id(), smp_commenced_mask))
 		rep_nop();
 	setup_secondary_APIC_clock();


* Re: i386 cpu hotplug bug - instant reboot when onlining secondary
  2006-02-21 16:20 ` Zwane Mwaikambo
@ 2006-02-27  7:50   ` Nathan Lynch
  2006-02-28 15:40     ` Zwane Mwaikambo
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Lynch @ 2006-02-27  7:50 UTC
  To: Zwane Mwaikambo; +Cc: linux-kernel

Zwane Mwaikambo wrote:
> On Sun, 19 Feb 2006, Nathan Lynch wrote:
> 
> > On a dual P3 Xeon machine, offlining and then onlining a cpu makes the
> > box instantly reboot.  I've been seeing this throughout the 2.6.16-rc
> > series, but wasn't able to collect more information until now.  Not
> > sure when this last worked, unfortunately.
> > 
> > With the debugging patch below, I get this on serial console:
> 
> Does 2.6.14 work? Also i wonder if it gets out of the trampoline...

2.6.14 works (albeit with an APIC error reported).  When retesting
2.6.16-rc4 with your patch on top of my debugging patch, I don't see the
"startup_secondary" line:

[17179709.100000] CPU 1 is now offline
[17179714.636000] Booting processor 1/1 eip 3000
[17179714.688000] CPU 1 irqstacks, hard=7837f000 soft=78377000
[17179714.756000] Setting warm reset code and vector.
[17179714.812000] 1.
[17179714.836000] 2.
[17179714.860000] 3.
[17179714.880000] Asserting INIT.
[17179714.916000] Waiting for send to finish...
[17179714.968000] +<7>Deasserting INIT.
[17179715.024000] Waiting for send to finish...
[17179715.072000] +<7>#startup loops: 2.
[17179715.116000] Sending STARTUP #1.
[17179715.160000] After apic_write.
[17179715.196000] Doing apic_write_around for target chip...
[17179715.260000] Doing apic_write_around to kick the second...

> 
> Index: linux-2.6.16-rc2/arch/i386/kernel/smpboot.c
> ===================================================================
> RCS file: /home/cvsroot/linux-2.6.16-rc2/arch/i386/kernel/smpboot.c,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 smpboot.c
> --- linux-2.6.16-rc2/arch/i386/kernel/smpboot.c	11 Feb 2006 18:55:06 -0000	1.1.1.1
> +++ linux-2.6.16-rc2/arch/i386/kernel/smpboot.c	21 Feb 2006 16:19:22 -0000
> @@ -514,6 +514,7 @@ static void __devinit start_secondary(vo
>  	cpu_init();
>  	preempt_disable();
>  	smp_callin();
> +	Dprintk("startup_secondary\n");
>  	while (!cpu_isset(smp_processor_id(), smp_commenced_mask))
>  		rep_nop();
>  	setup_secondary_APIC_clock();


* Re: i386 cpu hotplug bug - instant reboot when onlining secondary
  2006-02-27  7:50   ` Nathan Lynch
@ 2006-02-28 15:40     ` Zwane Mwaikambo
  2006-02-28 21:34       ` Nathan Lynch
  0 siblings, 1 reply; 9+ messages in thread
From: Zwane Mwaikambo @ 2006-02-28 15:40 UTC
  To: Nathan Lynch; +Cc: Linux Kernel

On Mon, 27 Feb 2006, Nathan Lynch wrote:

> Zwane Mwaikambo wrote:
> > On Sun, 19 Feb 2006, Nathan Lynch wrote:
> > 
> > > On a dual P3 Xeon machine, offlining and then onlining a cpu makes the
> > > box instantly reboot.  I've been seeing this throughout the 2.6.16-rc
> > > series, but wasn't able to collect more information until now.  Not
> > > sure when this last worked, unfortunately.
> > > 
> > > With the debugging patch below, I get this on serial console:
> > 
> > Does 2.6.14 work? Also i wonder if it gets out of the trampoline...
> 
> 2.6.14 works (albeit with an APIC error reported).  When retesting
> 2.6.16-rc4 with your patch on top of my debugging patch, I don't see the
> "startup_secondary" line:

Hi Nathan,

Can you try the following patch? We can start moving the WARM_BOOT_HLT
down until it triple faults (I'm assuming it at least gets this far).

Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S
===================================================================
RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 head.S
--- linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S	11 Feb 2006 16:55:14 -0000	1.1.1.1
+++ linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S	28 Feb 2006 15:34:34 -0000
@@ -146,6 +146,12 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
  * we know the trampoline has already loaded the boot_gdt_table GDT
  * for us.
  */
+#define warm_boot	tsc_sync_disabled-__PAGE_OFFSET
+#define WARM_BOOT_HLT		\
+	cmpl	$0, warm_boot;	\
+10:				\
+	jne	10b
+
 ENTRY(startup_32_smp)
 	cld
 	movl $(__BOOT_DS),%eax
@@ -168,6 +174,8 @@ ENTRY(startup_32_smp)
  *	NOTE! We have to correct for the fact that we're
  *	not yet offset PAGE_OFFSET..
  */
+	WARM_BOOT_HLT
+
 #define cr4_bits mmu_cr4_features-__PAGE_OFFSET
 	movl cr4_bits,%edx
 	andl %edx,%edx
Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 smpboot.c
--- linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	11 Feb 2006 16:55:14 -0000	1.1.1.1
+++ linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	28 Feb 2006 15:34:42 -0000
@@ -102,7 +102,7 @@ static cpumask_t smp_commenced_mask;
  * is no way to resync one AP against BP. TBD: for prescott and above, we
  * should use IA64's algorithm
  */
-static int __devinitdata tsc_sync_disabled;
+int __devinitdata tsc_sync_disabled;
 
 /* Per CPU bogomips and other parameters */
 struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;



* Re: i386 cpu hotplug bug - instant reboot when onlining secondary
  2006-02-28 15:40     ` Zwane Mwaikambo
@ 2006-02-28 21:34       ` Nathan Lynch
  2006-02-28 22:13         ` Zwane Mwaikambo
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Lynch @ 2006-02-28 21:34 UTC
  To: Zwane Mwaikambo; +Cc: Linux Kernel

Zwane Mwaikambo wrote:
> On Mon, 27 Feb 2006, Nathan Lynch wrote:
> 
> > Zwane Mwaikambo wrote:
> > > On Sun, 19 Feb 2006, Nathan Lynch wrote:
> > > 
> > > > On a dual P3 Xeon machine, offlining and then onlining a cpu makes the
> > > > box instantly reboot.  I've been seeing this throughout the 2.6.16-rc
> > > > series, but wasn't able to collect more information until now.  Not
> > > > sure when this last worked, unfortunately.
> > > > 
> > > > With the debugging patch below, I get this on serial console:
> > > 
> > > Does 2.6.14 work? Also i wonder if it gets out of the trampoline...
> > 
> > 2.6.14 works (albeit with an APIC error reported).  When retesting
> > 2.6.16-rc4 with your patch on top of my debugging patch, I don't see the
> > "startup_secondary" line:
> 
> Hi Nathan,
> 
> Can you try the following patch? We can start moving the WARM_BOOT_HLT 
> down until it triple faults (i'm assuming it at least gets this far).

Here's what I got with this one on top of a day-old -git (all
debugging patches still applied):

[17179725.020000] CPU 1 is now offline
[17179730.900000] Booting processor 1/1 eip 3000
[17179730.952000] CPU 1 irqstacks, hard=7837f000 soft=78377000
[17179731.020000] Setting warm reset code and vector.
[17179731.076000] 1.
[17179731.100000] 2.
[17179731.120000] 3.
[17179731.144000] Asserting INIT.
[17179731.180000] Waiting for send to finish...
[17179731.232000] +<7>Deasserting INIT.
[17179731.284000] Waiting for send to finish...
[17179731.336000] +<7>#startup loops: 2.
[17179731.380000] Sending STARTUP #1.
[17179731.420000] After apic_write.
[17179731.460000] Doing apic_write_around for target chip...
[17179731.524000] Doing apic_write_around to kick the second...
[17179731.592000] Startup point 1.
[17179731.632000] Waiting for send to finish...
[17179731.680000] +<7>Sending STARTUP #2.
[17179731.728000] After apic_write.
[17179731.768000] Doing apic_write_around for target chip...
[17179731.832000] Doing apic_write_around to kick the second...
[17179731.900000] Startup point 1.
[17179731.936000] Waiting for send to finish...
[17179731.988000] +<7>After Startup.
[17179732.028000] Before Callout 1.
[17179732.068000] After Callout 1.
[17179737.080000] Stuck ??
[17179737.108000] Inquiring remote APIC #1...
[17179737.156000] ... APIC #1 ID: 01000000
[17179737.204000] ... APIC #1 VERSION: 00040011
[17179737.256000] ... APIC #1 SPIV: 000000ff


> 
> Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S
> ===================================================================
> RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 head.S
> --- linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S	11 Feb 2006 16:55:14 -0000	1.1.1.1
> +++ linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S	28 Feb 2006 15:34:34 -0000
> @@ -146,6 +146,12 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
>   * we know the trampoline has already loaded the boot_gdt_table GDT
>   * for us.
>   */
> +#define warm_boot	tsc_sync_disabled-__PAGE_OFFSET
> +#define WARM_BOOT_HLT		\
> +	cmpl	$0, warm_boot;	\
> +10:				\
> +	jne	10b
> +
>  ENTRY(startup_32_smp)
>  	cld
>  	movl $(__BOOT_DS),%eax
> @@ -168,6 +174,8 @@ ENTRY(startup_32_smp)
>   *	NOTE! We have to correct for the fact that we're
>   *	not yet offset PAGE_OFFSET..
>   */
> +	WARM_BOOT_HLT
> +
>  #define cr4_bits mmu_cr4_features-__PAGE_OFFSET
>  	movl cr4_bits,%edx
>  	andl %edx,%edx
> Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c
> ===================================================================
> RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 smpboot.c
> --- linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	11 Feb 2006 16:55:14 -0000	1.1.1.1
> +++ linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	28 Feb 2006 15:34:42 -0000
> @@ -102,7 +102,7 @@ static cpumask_t smp_commenced_mask;
>   * is no way to resync one AP against BP. TBD: for prescott and above, we
>   * should use IA64's algorithm
>   */
> -static int __devinitdata tsc_sync_disabled;
> +int __devinitdata tsc_sync_disabled;
>  
>  /* Per CPU bogomips and other parameters */
>  struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;
> 


* Re: i386 cpu hotplug bug - instant reboot when onlining secondary
  2006-02-28 21:34       ` Nathan Lynch
@ 2006-02-28 22:13         ` Zwane Mwaikambo
  2006-03-01  3:28           ` Nathan Lynch
  0 siblings, 1 reply; 9+ messages in thread
From: Zwane Mwaikambo @ 2006-02-28 22:13 UTC
  To: Nathan Lynch; +Cc: Linux Kernel

On Tue, 28 Feb 2006, Nathan Lynch wrote:

> Zwane Mwaikambo wrote:
> > On Mon, 27 Feb 2006, Nathan Lynch wrote:
> > 
> > > Zwane Mwaikambo wrote:
> > > > On Sun, 19 Feb 2006, Nathan Lynch wrote:
> > > > 
> > > > > On a dual P3 Xeon machine, offlining and then onlining a cpu makes the
> > > > > box instantly reboot.  I've been seeing this throughout the 2.6.16-rc
> > > > > series, but wasn't able to collect more information until now.  Not
> > > > > sure when this last worked, unfortunately.
> > > > > 
> > > > > With the debugging patch below, I get this on serial console:
> > > > 
> > > > Does 2.6.14 work? Also i wonder if it gets out of the trampoline...
> > > 
> > > 2.6.14 works (albeit with an APIC error reported).  When retesting
> > > 2.6.16-rc4 with your patch on top of my debugging patch, I don't see the
> > > "startup_secondary" line:
> > 
> > Hi Nathan,
> > 
> > Can you try the following patch? We can start moving the WARM_BOOT_HLT 
> > down until it triple faults (i'm assuming it at least gets this far).
> 
> Here's what I got with this one on top of a day-old -git (all
> debugging patches still applied):

Looks good. How about the following?

Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S
===================================================================
RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 head.S
--- linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S	11 Feb 2006 16:55:14 -0000	1.1.1.1
+++ linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S	28 Feb 2006 22:12:25 -0000
@@ -146,6 +146,12 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
  * we know the trampoline has already loaded the boot_gdt_table GDT
  * for us.
  */
+#define warm_boot	tsc_sync_disabled-__PAGE_OFFSET
+#define WARM_BOOT_HLT		\
+	cmpl	$0, warm_boot;	\
+10:				\
+	jne	10b
+
 ENTRY(startup_32_smp)
 	cld
 	movl $(__BOOT_DS),%eax
@@ -324,6 +330,7 @@ is386:	movl $2,%ecx		# set MP
 	cmpb $0,%cl
 	je 1f			# the first CPU calls start_kernel
 				# all other CPUs call initialize_secondary
+	WARM_BOOT_HLT
 	call initialize_secondary
 	jmp L6
 1:
Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 smpboot.c
--- linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	11 Feb 2006 16:55:14 -0000	1.1.1.1
+++ linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	28 Feb 2006 15:34:42 -0000
@@ -102,7 +102,7 @@ static cpumask_t smp_commenced_mask;
  * is no way to resync one AP against BP. TBD: for prescott and above, we
  * should use IA64's algorithm
  */
-static int __devinitdata tsc_sync_disabled;
+int __devinitdata tsc_sync_disabled;
 
 /* Per CPU bogomips and other parameters */
 struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;


* Re: i386 cpu hotplug bug - instant reboot when onlining secondary
  2006-02-28 22:13         ` Zwane Mwaikambo
@ 2006-03-01  3:28           ` Nathan Lynch
  2006-03-01  6:31             ` Zwane Mwaikambo
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Lynch @ 2006-03-01  3:28 UTC
  To: Zwane Mwaikambo; +Cc: Linux Kernel

Zwane Mwaikambo wrote:
> On Tue, 28 Feb 2006, Nathan Lynch wrote:
> 
> > Zwane Mwaikambo wrote:
> > > On Mon, 27 Feb 2006, Nathan Lynch wrote:
> > > 
> > > > Zwane Mwaikambo wrote:
> > > > > On Sun, 19 Feb 2006, Nathan Lynch wrote:
> > > > > 
> > > > > > On a dual P3 Xeon machine, offlining and then onlining a cpu makes the
> > > > > > box instantly reboot.  I've been seeing this throughout the 2.6.16-rc
> > > > > > series, but wasn't able to collect more information until now.  Not
> > > > > > sure when this last worked, unfortunately.
> > > > > > 
> > > > > > With the debugging patch below, I get this on serial console:
> > > > > 
> > > > > Does 2.6.14 work? Also i wonder if it gets out of the trampoline...
> > > > 
> > > > 2.6.14 works (albeit with an APIC error reported).  When retesting
> > > > 2.6.16-rc4 with your patch on top of my debugging patch, I don't see the
> > > > "startup_secondary" line:
> > > 
> > > Hi Nathan,
> > > 
> > > Can you try the following patch? We can start moving the WARM_BOOT_HLT 
> > > down until it triple faults (i'm assuming it at least gets this far).
> > 
> > Here's what I got with this one on top of a day-old -git (all
> > debugging patches still applied):
> 
> Looks good, how about the following

I now get:

[17179687.244000] CPU 1 is now offline
[17179693.164000] Booting processor 1/1 eip 3000
[17179693.216000] CPU 1 irqstacks, hard=7837f000 soft=78377000
[17179693.284000] Setting warm reset code and vector.
[17179693.340000] 1.
[17179693.364000] 2.
[17179693.388000] 3.
[17179693.408000] Asserting INIT.
[17179693.448000] Waiting for send to finish...
[17179693.496000] +<7>Deasserting INIT.
[17179693.552000] Waiting for send to finish...
[17179693.600000] +<7>#startup loops: 2.
[17179693.644000] Sending STARTUP #1.
[17179693.688000] After apic_write.
[17179693.724000] Doing apic_write_around for target chip...
[17179693.788000] Doing apic_write_around to kick the second...
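
Note on the technique being used here: WARM_BOOT_HLT is a spin marker. If
the secondary CPU reaches it, the CPU loops there and the boot CPU
eventually reports "Stuck ??"; if the fault happens before the marker, the
box reboots as before. The earlier placement (near the top of
startup_32_smp) gave a hang, while this placement (just before the call to
initialize_secondary) appears to give a reboot, which would put the fault
between the two points. The reasoning can be sketched as a binary search in
C. This is a toy simulation under the assumption that the boot path is a
linear list of stages; the stage numbering and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum outcome { HANG, REBOOT };

/* One experiment: the spin marker sits just before stage `marker`, and the
 * triple fault happens inside stage `fault`. The CPU hangs at the marker
 * only if it gets there before faulting. */
static enum outcome run_once(size_t marker, size_t fault)
{
    return marker <= fault ? HANG : REBOOT;
}

/* Locate the faulting stage from HANG/REBOOT observations alone. */
static size_t find_fault(size_t nstages, size_t fault)
{
    assert(nstages > 0 && fault < nstages);
    size_t lo = 0, hi = nstages;        /* fault lies in [lo, hi) */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (run_once(mid, fault) == HANG)
            lo = mid;                   /* reached the marker: fault is at or after mid */
        else
            hi = mid;                   /* rebooted first: fault is before mid */
    }
    return lo;
}
```

Each real experiment costs a rebuild and a reboot, so halving the interval
per attempt keeps the number of test cycles small.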


> Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S
> ===================================================================
> RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 head.S
> --- linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S	11 Feb 2006 16:55:14 -0000	1.1.1.1
> +++ linux-2.6.16-rc2-mm1/arch/i386/kernel/head.S	28 Feb 2006 22:12:25 -0000
> @@ -146,6 +146,12 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
>   * we know the trampoline has already loaded the boot_gdt_table GDT
>   * for us.
>   */
> +#define warm_boot	tsc_sync_disabled-__PAGE_OFFSET
> +#define WARM_BOOT_HLT		\
> +	cmpl	$0, warm_boot;	\
> +10:				\
> +	jne	10b
> +
>  ENTRY(startup_32_smp)
>  	cld
>  	movl $(__BOOT_DS),%eax
> @@ -324,6 +330,7 @@ is386:	movl $2,%ecx		# set MP
>  	cmpb $0,%cl
>  	je 1f			# the first CPU calls start_kernel
>  				# all other CPUs call initialize_secondary
> +	WARM_BOOT_HLT
>  	call initialize_secondary
>  	jmp L6
>  1:
> Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c
> ===================================================================
> RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 smpboot.c
> --- linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	11 Feb 2006 16:55:14 -0000	1.1.1.1
> +++ linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	28 Feb 2006 15:34:42 -0000
> @@ -102,7 +102,7 @@ static cpumask_t smp_commenced_mask;
>   * is no way to resync one AP against BP. TBD: for prescott and above, we
>   * should use IA64's algorithm
>   */
> -static int __devinitdata tsc_sync_disabled;
> +int __devinitdata tsc_sync_disabled;
>  
>  /* Per CPU bogomips and other parameters */
>  struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;


* Re: i386 cpu hotplug bug - instant reboot when onlining secondary
  2006-03-01  3:28           ` Nathan Lynch
@ 2006-03-01  6:31             ` Zwane Mwaikambo
  2006-03-06 13:25               ` Nathan Lynch
  0 siblings, 1 reply; 9+ messages in thread
From: Zwane Mwaikambo @ 2006-03-01  6:31 UTC
  To: Nathan Lynch; +Cc: Linux Kernel

On Tue, 28 Feb 2006, Nathan Lynch wrote:

> 
> [17179687.244000] CPU 1 is now offline
> [17179693.164000] Booting processor 1/1 eip 3000
> [17179693.216000] CPU 1 irqstacks, hard=7837f000 soft=78377000
> [17179693.284000] Setting warm reset code and vector.
> [17179693.340000] 1.
> [17179693.364000] 2.
> [17179693.388000] 3.
> [17179693.408000] Asserting INIT.
> [17179693.448000] Waiting for send to finish...
> [17179693.496000] +<7>Deasserting INIT.
> [17179693.552000] Waiting for send to finish...
> [17179693.600000] +<7>#startup loops: 2.
> [17179693.644000] Sending STARTUP #1.
> [17179693.688000] After apic_write.
> [17179693.724000] Doing apic_write_around for target chip...
> [17179693.788000] Doing apic_write_around to kick the second...

Ok, could you apply only the following patch?

Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 smpboot.c
--- linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	11 Feb 2006 16:55:14 -0000	1.1.1.1
+++ linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	1 Mar 2006 06:30:06 -0000
@@ -535,9 +535,14 @@ static void __devinit start_secondary(vo
 	 * booting is too fragile that we want to limit the
 	 * things done here to the most necessary things.
 	 */
+	Dprintk("S1\n");
 	cpu_init();
+	Dprintk("S2\n");
 	preempt_disable();
+	Dprintk("S3\n");
 	smp_callin();
+	Dprintk("S4\n");
+
 	while (!cpu_isset(smp_processor_id(), smp_commenced_mask))
 		rep_nop();
 	setup_secondary_APIC_clock();


* Re: i386 cpu hotplug bug - instant reboot when onlining secondary
  2006-03-01  6:31             ` Zwane Mwaikambo
@ 2006-03-06 13:25               ` Nathan Lynch
  0 siblings, 0 replies; 9+ messages in thread
From: Nathan Lynch @ 2006-03-06 13:25 UTC
  To: Zwane Mwaikambo; +Cc: Linux Kernel

Zwane Mwaikambo wrote:
> On Tue, 28 Feb 2006, Nathan Lynch wrote:
> 
> > 
> > [17179687.244000] CPU 1 is now offline
> > [17179693.164000] Booting processor 1/1 eip 3000
> > [17179693.216000] CPU 1 irqstacks, hard=7837f000 soft=78377000
> > [17179693.284000] Setting warm reset code and vector.
> > [17179693.340000] 1.
> > [17179693.364000] 2.
> > [17179693.388000] 3.
> > [17179693.408000] Asserting INIT.
> > [17179693.448000] Waiting for send to finish...
> > [17179693.496000] +<7>Deasserting INIT.
> > [17179693.552000] Waiting for send to finish...
> > [17179693.600000] +<7>#startup loops: 2.
> > [17179693.644000] Sending STARTUP #1.
> > [17179693.688000] After apic_write.
> > [17179693.724000] Doing apic_write_around for target chip...
> > [17179693.788000] Doing apic_write_around to kick the second...
> 
> Ok, could you apply only the following patch?

Sorry for the delay in getting back to you.

I applied your latest patch (plus the one-liner to make Dprintk actually
print), and I don't see any of the new print statements:

[17179687.744000] CPU 1 is now offline
[17179693.032000] Booting processor 1/1 eip 3000
[17179693.084000] CPU 1 irqstacks, hard=783da000 soft=783d2000
[17179693.152000] Setting warm reset code and vector.
[17179693.208000] 1.
[17179693.232000] 2.
[17179693.256000] 3.
[17179693.276000] Asserting INIT.
[17179693.316000] Waiting for send to finish...
[17179693.364000] +<7>Deasserting INIT.
[17179693.420000] Waiting for send to finish...
[17179693.468000] +<7>#startup loops: 2.
[17179693.512000] Sending STARTUP #1.
[17179693.556000] After apic_write.
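
Note on the one-liner: it is presumably the apic.h change from the first
message, which turns the empty Dprintk macro into a printk at KERN_DEBUG
level. Below is a user-space sketch of the same variadic-macro pattern in C,
assuming GCC or Clang (the ##__VA_ARGS__ form is a GNU extension). The
buffer dbg_buf is a hypothetical stand-in for the kernel log, and the "<7>"
value of KERN_DEBUG is as I recall it for kernels of this era.

```c
#include <stdio.h>
#include <string.h>

#define KERN_DEBUG "<7>"      /* log-level prefix (assumed value) */

static char dbg_buf[256];     /* hypothetical stand-in for the kernel log */

/* Append one formatted message, with its level prefix, to dbg_buf.
 * fmt must be a string literal so it can be pasted after KERN_DEBUG. */
#define Dprintk(fmt, ...)                                          \
    do {                                                           \
        size_t used_ = strlen(dbg_buf);                            \
        snprintf(dbg_buf + used_, sizeof dbg_buf - used_,          \
                 KERN_DEBUG fmt, ##__VA_ARGS__);                   \
    } while (0)
```

This also likely explains the "+<7>Deasserting INIT." lines in the logs: a
Dprintk without a trailing newline is followed by another Dprintk, so the
second "<7>" prefix lands mid-line and is printed literally.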



> 
> Index: linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c
> ===================================================================
> RCS file: /home/cvsroot/linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 smpboot.c
> --- linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	11 Feb 2006 16:55:14 -0000	1.1.1.1
> +++ linux-2.6.16-rc2-mm1/arch/i386/kernel/smpboot.c	1 Mar 2006 06:30:06 -0000
> @@ -535,9 +535,14 @@ static void __devinit start_secondary(vo
>  	 * booting is too fragile that we want to limit the
>  	 * things done here to the most necessary things.
>  	 */
> +	Dprintk("S1\n");
>  	cpu_init();
> +	Dprintk("S2\n");
>  	preempt_disable();
> +	Dprintk("S3\n");
>  	smp_callin();
> +	Dprintk("S4\n");
> +
>  	while (!cpu_isset(smp_processor_id(), smp_commenced_mask))
>  		rep_nop();
>  	setup_secondary_APIC_clock();
