public inbox for linux-ia64@vger.kernel.org
* [Linux-ia64] [patch] logical CPU numbering
@ 2003-03-20 19:03 Martin Hicks
  2003-03-20 20:07 ` Wichmann, Mats D
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Martin Hicks @ 2003-03-20 19:03 UTC
  To: linux-ia64

Hello,

If a CPU fails to start during smp_boot_cpus(), then the logical CPU
numbering will have a "hole".  Using the number of booted CPUs
instead of the loop index will correct this.

This patch is against 2.4.21-pre5.

thanks,
mh

-- 
Wild Open Source Inc.                  mort@wildopensource.com



--- linux-2.4.21-pre5-ia64-030312.pristine/arch/ia64/kernel/smpboot.c	Sun Mar 16 10:18:53 2003
+++ linux-2.4.21-pre5-ia64-030312/arch/ia64/kernel/smpboot.c	Thu Mar 20 10:47:07 2003
@@ -522,7 +522,7 @@
 			/*
 			 * Make sure we unmap all failed CPUs
 			 */
-			if (ia64_cpu_to_sapicid[cpu] == -1)
+			if (ia64_cpu_to_sapicid[cpucount] == -1)
 				printk("phys CPU#%d not responding - cannot use it.\n", cpu);
 		}
 



* RE: [Linux-ia64] [patch] logical CPU numbering
  2003-03-20 19:03 [Linux-ia64] [patch] logical CPU numbering Martin Hicks
@ 2003-03-20 20:07 ` Wichmann, Mats D
  2003-03-25 18:03 ` Martin Hicks
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Wichmann, Mats D @ 2003-03-20 20:07 UTC
  To: linux-ia64

> If a CPU fails to start during smp_boot_cpus(), then the logical CPU
> numbering will have a "hole".  Using the number of booted CPUs
> instead of the loop index will correct this.

Just curious, does having a hole cause a problem?
It might well; I don't know.

I consider some future scenario where if cpu X
didn't come up, it can be hot-replaced, at which
point one might actually want to have that "slot"
reserved for it. 



* Re: [Linux-ia64] [patch] logical CPU numbering
  2003-03-20 19:03 [Linux-ia64] [patch] logical CPU numbering Martin Hicks
  2003-03-20 20:07 ` Wichmann, Mats D
@ 2003-03-25 18:03 ` Martin Hicks
  2003-04-03 18:31 ` Martin Hicks
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Martin Hicks @ 2003-03-25 18:03 UTC
  To: linux-ia64


On Thu, Mar 20, 2003 at 12:07:32PM -0800, Wichmann, Mats D wrote:
> 
> > If a CPU fails to start during smp_boot_cpus(), then the logical CPU
> > numbering will have a "hole".  Using the number of booted CPUs
> > instead of the loop index will correct this.
> 
> Just curious, does having a hole cause a problem?
> It might well; I don't know.
> 
> I consider some future scenario where if cpu X
> didn't come up, it can be hot-replaced, at which
> point one might actually want to have that "slot"
> reserved for it. 

I thought there was some code that depended on having logical CPUs
numbered without holes.  I can't find any evidence of it anymore, so
perhaps it's not a bug.

mh

-- 
Wild Open Source Inc.                  mort@wildopensource.com



* Re: [Linux-ia64] [patch] logical CPU numbering
  2003-03-20 19:03 [Linux-ia64] [patch] logical CPU numbering Martin Hicks
  2003-03-20 20:07 ` Wichmann, Mats D
  2003-03-25 18:03 ` Martin Hicks
@ 2003-04-03 18:31 ` Martin Hicks
  2003-04-17 19:13 ` Bjorn Helgaas
  2003-04-17 20:25 ` Martin Hicks
  4 siblings, 0 replies; 6+ messages in thread
From: Martin Hicks @ 2003-04-03 18:31 UTC
  To: linux-ia64

On Thu, Mar 20, 2003 at 02:03:57PM -0500, Martin Hicks wrote:
> 
> Hello,
> 
> If a CPU fails to start during smp_boot_cpus(), then the logical CPU
> numbering will have a "hole".  Using the number of booted CPUs
> instead of the loop index will correct this.
> 
> This patch is against 2.4.21-pre5.

This patch should be applied.  I finally got around to doing some more
testing with it.  If a CPU fails to start, currently we get messages
like the following for subsequent CPUs:


CPU 17: nasid 18, slice 0, cnode 9
CPU 17: base freq=200.000MHz, ITC ratio=10/2, ITC freq=1000.000MHz
Calibrating delay loop... 1494.72 BogoMIPS
phys CPU#17 (0x12) not responding - cannot use it.               <<-BOGUS


The patch below fixes this problem.  This is the same patch as before,
reposted just to make things easier.  It is against
2.4.21-pre5-ia64-030312.

Thanks,
mh

-- 
Wild Open Source Inc.                  mort@wildopensource.com


--- linux-2.4.21-pre5-ia64-030312.pristine/arch/ia64/kernel/smpboot.c	Sun Mar 16 10:18:53 2003
+++ linux-2.4.21-pre5-ia64-030312/arch/ia64/kernel/smpboot.c	Thu Mar 20 10:47:07 2003
@@ -522,7 +522,7 @@
 			/*
 			 * Make sure we unmap all failed CPUs
 			 */
-			if (ia64_cpu_to_sapicid[cpu] == -1)
+			if (ia64_cpu_to_sapicid[cpucount] == -1)
 				printk("phys CPU#%d not responding - cannot use it.\n", cpu);
 		}
 



* Re: [Linux-ia64] [patch] logical CPU numbering
  2003-03-20 19:03 [Linux-ia64] [patch] logical CPU numbering Martin Hicks
                   ` (2 preceding siblings ...)
  2003-04-03 18:31 ` Martin Hicks
@ 2003-04-17 19:13 ` Bjorn Helgaas
  2003-04-17 20:25 ` Martin Hicks
  4 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2003-04-17 19:13 UTC
  To: linux-ia64

On Thursday 03 April 2003 11:31 am, Martin Hicks wrote:
> This patch should be applied.  I finally got around to doing some more
> testing with it.  If a CPU fails to start, currently we get messages
> like the following for subsequent CPUs:
> 
> 
> CPU 17: nasid 18, slice 0, cnode 9
> CPU 17: base freq=200.000MHz, ITC ratio=10/2, ITC freq=1000.000MHz
> Calibrating delay loop... 1494.72 BogoMIPS
> phys CPU#17 (0x12) not responding - cannot use it.               <<-BOGUS

I don't quite understand how this works.  The current code is clearly
wrong, but if the AP in a 2-CPU system fails to start, won't the new
code print "phys CPU#0 not responding"?  That doesn't seem accurate.

If we really need this printk, it seems like the logical place to put it
would be in do_boot_cpu(), where we already print the "Processor X/Y
is stuck" message.  2.5 seems to have just removed the "not responding"
printk, though, and I'd be inclined to do the same.  Any objections?

> --- linux-2.4.21-pre5-ia64-030312.pristine/arch/ia64/kernel/smpboot.c	Sun Mar 16 10:18:53 2003
> +++ linux-2.4.21-pre5-ia64-030312/arch/ia64/kernel/smpboot.c	Thu Mar 20 10:47:07 2003
> @@ -522,7 +522,7 @@
>  			/*
>  			 * Make sure we unmap all failed CPUs
>  			 */
> -			if (ia64_cpu_to_sapicid[cpu] == -1)
> +			if (ia64_cpu_to_sapicid[cpucount] == -1)
>  				printk("phys CPU#%d not responding - cannot use it.\n", cpu);
>  		}
>  
> 
> 




* Re: [Linux-ia64] [patch] logical CPU numbering
  2003-03-20 19:03 [Linux-ia64] [patch] logical CPU numbering Martin Hicks
                   ` (3 preceding siblings ...)
  2003-04-17 19:13 ` Bjorn Helgaas
@ 2003-04-17 20:25 ` Martin Hicks
  4 siblings, 0 replies; 6+ messages in thread
From: Martin Hicks @ 2003-04-17 20:25 UTC
  To: linux-ia64


On Thu, Apr 17, 2003 at 01:13:44PM -0600, Bjorn Helgaas wrote:
> On Thursday 03 April 2003 11:31 am, Martin Hicks wrote:
> > This patch should be applied.  I finally got around to doing some more
> > testing with it.  If a CPU fails to start, currently we get messages
> > like the following for subsequent CPUs:
> > 
> > 
> > CPU 17: nasid 18, slice 0, cnode 9
> > CPU 17: base freq=200.000MHz, ITC ratio=10/2, ITC freq=1000.000MHz
> > Calibrating delay loop... 1494.72 BogoMIPS
> > phys CPU#17 (0x12) not responding - cannot use it.               <<-BOGUS
> 
> I don't quite understand how this works.  The current code is clearly
> wrong, but if the AP in a 2-CPU system fails to start, won't the new
> code print "phys CPU#0 not responding"?  That doesn't seem accurate.
> 
> If we really need this printk, it seems like the logical place to put it
> would be in do_boot_cpu(), where we already print the "Processor X/Y
> is stuck" message.  2.5 seems to have just removed the "not responding"
> printk, though, and I'd be inclined to do the same.  Any objections?

I have no objections.

mh

-- 
Wild Open Source Inc.                  mort@wildopensource.com


