public inbox for linux-kernel@vger.kernel.org
* disabled APICs being counted as processors ?
@ 2014-01-23 22:13 Dave Jones
  2014-01-25  7:41 ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Jones @ 2014-01-23 22:13 UTC
  To: x86; +Cc: Linux Kernel

I have a system with 4 cores (kernel configured with CONFIG_NR_CPUS=4) that shows this during boot:

[    0.000000] smpboot: 8 Processors exceeds NR_CPUS limit of 4

It looks like this is because of these entries:

[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0xff] disabled)

Should the CPU counting code be ignoring those disabled APICs ?

	Dave
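
A small user-space sketch of the tally in question: it classifies boot-log
lines of the form shown above and counts enabled versus disabled LAPIC
entries. The struct and function names are hypothetical, and the line
format is assumed to match the log excerpt exactly; this is not how the
kernel itself does the counting.

```c
#include <stdio.h>
#include <string.h>

/* Running count of LAPIC entries seen in a boot log (hypothetical type). */
struct lapic_tally {
	unsigned int enabled;
	unsigned int disabled;
};

/*
 * Classify one log line. Returns 1 and updates the tally if the line is
 * an "ACPI: LAPIC (...)" entry in the assumed format, 0 otherwise.
 */
static int tally_lapic_line(const char *line, struct lapic_tally *t)
{
	const char *p = strstr(line, "ACPI: LAPIC (");
	unsigned int acpi_id, lapic_id;
	char state[16];

	if (!p)
		return 0;
	/* %x accepts the leading "0x" that the log prints. */
	if (sscanf(p, "ACPI: LAPIC (acpi_id[%x] lapic_id[%x] %15[a-z])",
		   &acpi_id, &lapic_id, state) != 3)
		return 0;

	if (strcmp(state, "enabled") == 0)
		t->enabled++;
	else if (strcmp(state, "disabled") == 0)
		t->disabled++;
	else
		return 0;
	return 1;
}
```

Fed the eight LAPIC lines from the log above, the tally would come out as
4 enabled and 4 disabled, which together give the 8 in the smpboot message.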


* Re: disabled APICs being counted as processors ?
  2014-01-23 22:13 disabled APICs being counted as processors ? Dave Jones
@ 2014-01-25  7:41 ` Ingo Molnar
  2014-01-25 15:30   ` Dave Jones
  2014-01-25 16:42   ` Henrique de Moraes Holschuh
  0 siblings, 2 replies; 11+ messages in thread
From: Ingo Molnar @ 2014-01-25  7:41 UTC
  To: Dave Jones, x86, Linux Kernel


* Dave Jones <davej@redhat.com> wrote:

> I have a system with 4 cores (configured with CONFIG_NR_CPUS=4) that shows during boot..
> 
> [    0.000000] smpboot: 8 Processors exceeds NR_CPUS limit of 4
> 
> it looks like this is because..
> 
> [    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0xff] disabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0xff] disabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0xff] disabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0xff] disabled)
> 
> Should the CPU counting code be ignoring those disabled APICs ?

Hm, so to the kernel it looks as if those were 'possible CPUs',
in theory hotpluggable. Not sure what they are - disabled cores in an
8-core system? Or BIOS reporting crap?

But perhaps the boot message could be improved to say something like:

> [    0.000000] smpboot: 8 possible processors exceeds NR_CPUS limit of 4

?

Thanks,

	Ingo


* Re: disabled APICs being counted as processors ?
  2014-01-25  7:41 ` Ingo Molnar
@ 2014-01-25 15:30   ` Dave Jones
  2014-01-26  6:41     ` David Rientjes
  2014-01-25 16:42   ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 11+ messages in thread
From: Dave Jones @ 2014-01-25 15:30 UTC
  To: Ingo Molnar; +Cc: x86, Linux Kernel

On Sat, Jan 25, 2014 at 08:41:07AM +0100, Ingo Molnar wrote:
 > 
 > * Dave Jones <davej@redhat.com> wrote:
 > 
 > > I have a system with 4 cores (configured with CONFIG_NR_CPUS=4) that shows during boot..
 > > 
 > > [    0.000000] smpboot: 8 Processors exceeds NR_CPUS limit of 4
 > > 
 > > it looks like this is because..
 > > 
 > > [    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
 > > [    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
 > > [    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
 > > [    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
 > > [    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0xff] disabled)
 > > [    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0xff] disabled)
 > > [    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0xff] disabled)
 > > [    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0xff] disabled)
 > > 
 > > Should the CPU counting code be ignoring those disabled APICs ?
 > 
 > Hm, so to the kernel it looks like as if those were 'possible CPUs', 
 > in theory hotpluggable. Not sure what they are - disabled cores in an 
 > 8-core system? Or BIOS reporting crap?
 > 
 > But perhaps the boot message could be improved to say something like:
 > 
 > > [    0.000000] smpboot: 8 possible processors exceeds NR_CPUS limit of 4

It's not possible though. It's an i5-4670T, in a single socket board.
It doesn't even have hyperthreading. http://ark.intel.com/products/75050/Intel-Core-i5-4670T-Processor-6M-Cache-up-to-3_30-GHz

	Dave

 


* Re: disabled APICs being counted as processors ?
  2014-01-25  7:41 ` Ingo Molnar
  2014-01-25 15:30   ` Dave Jones
@ 2014-01-25 16:42   ` Henrique de Moraes Holschuh
  1 sibling, 0 replies; 11+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-01-25 16:42 UTC
  To: Ingo Molnar; +Cc: Dave Jones, x86, Linux Kernel

On Sat, 25 Jan 2014, Ingo Molnar wrote:
> * Dave Jones <davej@redhat.com> wrote:
> > I have a system with 4 cores (configured with CONFIG_NR_CPUS=4) that shows during boot..
> > 
> > [    0.000000] smpboot: 8 Processors exceeds NR_CPUS limit of 4
> > 
> > it looks like this is because..
> > 
> > [    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> > [    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
> > [    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
> > [    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
> > [    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0xff] disabled)
> > [    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0xff] disabled)
> > [    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0xff] disabled)
> > [    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0xff] disabled)
> > 
> > Should the CPU counting code be ignoring those disabled APICs ?
> 
> Hm, so to the kernel it looks like as if those were 'possible CPUs', 
> in theory hotpluggable. Not sure what they are - disabled cores in an 
> 8-core system? Or BIOS reporting crap?

It is sort of a standard practice for the BIOS to report ACPI tables listing
all possible cores for the largest possible processor the motherboard
(sometimes several motherboards that share the same BIOS) could handle.
Really easy to find that quirk in anything by SuperMicro, for example.

The kernel thinks these extra CPU cores will show up eventually, so it lists
them (and reserves resources to handle them) as hotpluggable CPUs.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: disabled APICs being counted as processors ?
  2014-01-25 15:30   ` Dave Jones
@ 2014-01-26  6:41     ` David Rientjes
  2014-01-26  8:36       ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: David Rientjes @ 2014-01-26  6:41 UTC
  To: Dave Jones, Ingo Molnar, x86, Linux Kernel

On Sat, 25 Jan 2014, Dave Jones wrote:

>  > > it looks like this is because..
>  > > 
>  > > [    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
>  > > [    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
>  > > [    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
>  > > [    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
>  > > [    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0xff] disabled)
>  > > [    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0xff] disabled)
>  > > [    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0xff] disabled)
>  > > [    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0xff] disabled)
>  > > 
>  > > Should the CPU counting code be ignoring those disabled APICs ?
>  > 
>  > Hm, so to the kernel it looks like as if those were 'possible CPUs', 
>  > in theory hotpluggable. Not sure what they are - disabled cores in an 
>  > 8-core system? Or BIOS reporting crap?
>  > 
>  > But perhaps the boot message could be improved to say something like:
>  > 
>  > > [    0.000000] smpboot: 8 possible processors exceeds NR_CPUS limit of 4
> 
> It's not possible though. It's an i5-4670T, in a single socket board.
> It doesn't even have hyperthreading. http://ark.intel.com/products/75050/Intel-Core-i5-4670T-Processor-6M-Cache-up-to-3_30-GHz
> 

I don't think the "ACPI: LAPIC (... disabled)" lines are problematic; they
are simply reporting the ACPI processor id and APIC id for processors that
do not have their enabled flag set.  The ACPI spec allows such entries to
exist without the enabled flag set when the processor isn't present at all,
because the kernel will make no attempt to use it.

That said, I think the "smpboot: 8 Processors exceeds NR_CPUS limit of 4" 
line is unnecessary since, as you said, these processors don't physically 
exist.  I betcha that's because you have CONFIG_HOTPLUG_CPU enabled and 
it's counting the disabled cpus that were found when acpi_register_lapic() 
was done.  The warning is only really meaningful for cpus in 
cpu_possible_map, which aren't set for your disabled four, in the hotplug 
case where NR_CPUS is too small.


* Re: disabled APICs being counted as processors ?
  2014-01-26  6:41     ` David Rientjes
@ 2014-01-26  8:36       ` Ingo Molnar
  2014-01-26  8:51         ` Yinghai Lu
  2014-01-26  9:23         ` David Rientjes
  0 siblings, 2 replies; 11+ messages in thread
From: Ingo Molnar @ 2014-01-26  8:36 UTC
  To: David Rientjes; +Cc: Dave Jones, x86, Linux Kernel, Yinghai Lu


* David Rientjes <rientjes@google.com> wrote:

> On Sat, 25 Jan 2014, Dave Jones wrote:
> 
> >  > > it looks like this is because..
> >  > > 
> >  > > [    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> >  > > [    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
> >  > > [    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
> >  > > [    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
> >  > > [    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0xff] disabled)
> >  > > [    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0xff] disabled)
> >  > > [    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0xff] disabled)
> >  > > [    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0xff] disabled)
> >  > > 
> >  > > Should the CPU counting code be ignoring those disabled APICs ?
> >  > 
> >  > Hm, so to the kernel it looks like as if those were 'possible CPUs', 
> >  > in theory hotpluggable. Not sure what they are - disabled cores in an 
> >  > 8-core system? Or BIOS reporting crap?
> >  > 
> >  > But perhaps the boot message could be improved to say something like:
> >  > 
> >  > > [    0.000000] smpboot: 8 possible processors exceeds NR_CPUS limit of 4
> > 
> > It's not possible though. It's an i5-4670T, in a single socket board.
> > It doesn't even have hyperthreading. http://ark.intel.com/products/75050/Intel-Core-i5-4670T-Processor-6M-Cache-up-to-3_30-GHz
> > 
> 
> I don't think the "ACPI: LAPIC (... disabled)" lines are problematic, they 
> are simply reporting the acpi processor id and apic id for processors that 
> do not have their enabled flag set.  The acpi spec allows for these to 
> exist without the enabled flag set when the processor isn't present at all 
> because the kernel will make no attempt to use it.
> 
> That said, I think the "smpboot: 8 Processors exceeds NR_CPUS limit 
> of 4" line is unnecessary since, as you said, these processors don't 
> physically exist.  I betcha that's because you have 
> CONFIG_HOTPLUG_CPU enabled and it's counting the disabled cpus that 
> were found when acpi_register_lapic() was done.  The warning is only 
> really meaningful for cpus in cpu_possible_map, which aren't set for 
> your disabled four, in the hotplug case where NR_CPUS is too small.

No, this message is printed in prefill_possible_map() which 
_generates_ cpu_possible_map, so '8' is the number of bits in 
cpu_possible_map.

So the problem is that the counting of disabled but hotpluggable CPUs
is over-eager. Since I haven't actually seen _true_ hotplug CPU
hardware yet, I'd argue we do the change below - allocating space for
never-present CPUs is stupid. If there are true hot-plug CPUs around
that could come online after we've booted, then we want to know about
them explicitly.

Thoughts?

Thanks,

	Ingo

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index a32da80..75a351a 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1223,10 +1223,7 @@ __init void prefill_possible_map(void)
 	i = setup_max_cpus ?: 1;
 	if (setup_possible_cpus == -1) {
 		possible = num_processors;
-#ifdef CONFIG_HOTPLUG_CPU
-		if (setup_max_cpus)
-			possible += disabled_cpus;
-#else
+#ifndef CONFIG_HOTPLUG_CPU
 		if (possible > i)
 			possible = i;
 #endif
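
A rough model of the counting change in the patch above, written as plain
user-space C rather than kernel code. The struct, the function names and
the `hotplug` field (standing in for CONFIG_HOTPLUG_CPU) are assumptions
made for illustration, and only the branch taken when possible_cpus= is
not given on the command line (setup_possible_cpus == -1) is modelled.

```c
/* Inputs to the count; names mirror the kernel variables in the diff. */
struct cpu_counts {
	int num_processors;	/* LAPIC entries with the enabled flag set */
	int disabled_cpus;	/* LAPIC entries without it */
	int setup_max_cpus;	/* maxcpus= value, 0 for "nosmp" */
	int hotplug;		/* nonzero models CONFIG_HOTPLUG_CPU=y */
};

/* Count as computed before the patch: disabled CPUs are added in. */
static int possible_before(const struct cpu_counts *c)
{
	int i = c->setup_max_cpus ? c->setup_max_cpus : 1;
	int possible = c->num_processors;

	if (c->hotplug) {
		if (c->setup_max_cpus)
			possible += c->disabled_cpus;
	} else if (possible > i) {
		possible = i;
	}
	return possible;
}

/* Count as computed after the patch: disabled CPUs are ignored. */
static int possible_after(const struct cpu_counts *c)
{
	int i = c->setup_max_cpus ? c->setup_max_cpus : 1;
	int possible = c->num_processors;

	if (!c->hotplug && possible > i)
		possible = i;
	return possible;
}

/* Condition under which the "exceeds NR_CPUS limit" message would fire. */
static int exceeds_limit(int possible, int nr_cpus)
{
	return possible > nr_cpus;
}
```

With the numbers from this thread (4 enabled, 4 disabled, hotplug on, and
setup_max_cpus assumed to be 4), the model gives 8 before the patch and 4
after it, so the comparison against NR_CPUS=4 no longer trips.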


* Re: disabled APICs being counted as processors ?
  2014-01-26  8:36       ` Ingo Molnar
@ 2014-01-26  8:51         ` Yinghai Lu
  2014-01-26  9:09           ` Ingo Molnar
  2014-01-26  9:23         ` David Rientjes
  1 sibling, 1 reply; 11+ messages in thread
From: Yinghai Lu @ 2014-01-26  8:51 UTC
  To: Ingo Molnar
  Cc: David Rientjes, Dave Jones, the arch/x86 maintainers,
	Linux Kernel

On Sun, Jan 26, 2014 at 12:36 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> No, this message is printed in prefill_possible_map() which
> _generates_ cpu_possible_map, so '8' is the number of bits in
> cpu_possible_map.
>
> So the problem is that the counting of disabled but hotpluggable CPUs
> is over-eager. Since I haven't actually seen _true_ hotplug CPU
> hardware yet, I'd argue we do the change below - allocating space for
> never-present CPUs is stupid. If there's true hot-plug CPUs around
> that could come online after we've booted, then we want to know about
> them explicitly.
>
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index a32da80..75a351a 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1223,10 +1223,7 @@ __init void prefill_possible_map(void)
>         i = setup_max_cpus ?: 1;
>         if (setup_possible_cpus == -1) {
>                 possible = num_processors;
> -#ifdef CONFIG_HOTPLUG_CPU
> -               if (setup_max_cpus)
> -                       possible += disabled_cpus;
> -#else
> +#ifndef CONFIG_HOTPLUG_CPU
>                 if (possible > i)
>                         possible = i;
>  #endif

Agreed.

This mostly happens when one BIOS supports several configurations.

For example, a system that supports both 10-core and 15-core CPUs, or a
system that can be configured with either 4 or 8 sockets.

Thanks

Yinghai


* Re: disabled APICs being counted as processors ?
  2014-01-26  8:51         ` Yinghai Lu
@ 2014-01-26  9:09           ` Ingo Molnar
  0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2014-01-26  9:09 UTC
  To: Yinghai Lu
  Cc: David Rientjes, Dave Jones, the arch/x86 maintainers,
	Linux Kernel


* Yinghai Lu <yinghai@kernel.org> wrote:

> On Sun, Jan 26, 2014 at 12:36 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > No, this message is printed in prefill_possible_map() which
> > _generates_ cpu_possible_map, so '8' is the number of bits in
> > cpu_possible_map.
> >
> > So the problem is that the counting of disabled but hotpluggable CPUs
> > is over-eager. Since I haven't actually seen _true_ hotplug CPU
> > hardware yet, I'd argue we do the change below - allocating space for
> > never-present CPUs is stupid. If there's true hot-plug CPUs around
> > that could come online after we've booted, then we want to know about
> > them explicitly.
> >
> >
> > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > index a32da80..75a351a 100644
> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -1223,10 +1223,7 @@ __init void prefill_possible_map(void)
> >         i = setup_max_cpus ?: 1;
> >         if (setup_possible_cpus == -1) {
> >                 possible = num_processors;
> > -#ifdef CONFIG_HOTPLUG_CPU
> > -               if (setup_max_cpus)
> > -                       possible += disabled_cpus;
> > -#else
> > +#ifndef CONFIG_HOTPLUG_CPU
> >                 if (possible > i)
> >                         possible = i;
> >  #endif
> 
> Agreed.

A question would be kexec and virtualization: do any of those variants 
boot a kernel with 'disabled but working' CPUs, which could be 
hot-onlined later on?

Thanks,

	Ingo


* Re: disabled APICs being counted as processors ?
  2014-01-26  8:36       ` Ingo Molnar
  2014-01-26  8:51         ` Yinghai Lu
@ 2014-01-26  9:23         ` David Rientjes
  2014-01-26  9:29           ` Ingo Molnar
  1 sibling, 1 reply; 11+ messages in thread
From: David Rientjes @ 2014-01-26  9:23 UTC
  To: Ingo Molnar; +Cc: Dave Jones, x86, Linux Kernel, Yinghai Lu

On Sun, 26 Jan 2014, Ingo Molnar wrote:

> > I don't think the "ACPI: LAPIC (... disabled)" lines are problematic, they 
> > are simply reporting the acpi processor id and apic id for processors that 
> > do not have their enabled flag set.  The acpi spec allows for these to 
> > exist without the enabled flag set when the processor isn't present at all 
> > because the kernel will make no attempt to use it.
> > 
> > That said, I think the "smpboot: 8 Processors exceeds NR_CPUS limit 
> > of 4" line is unnecessary since, as you said, these processors don't 
> > physically exist.  I betcha that's because you have 
> > CONFIG_HOTPLUG_CPU enabled and it's counting the disabled cpus that 
> > were found when acpi_register_lapic() was done.  The warning is only 
> > really meaningful for cpus in cpu_possible_map, which aren't set for 
> > your disabled four, in the hotplug case where NR_CPUS is too small.
> 
> No, this message is printed in prefill_possible_map() which 
> _generates_ cpu_possible_map, so '8' is the number of bits in 
> cpu_possible_map.
> 

Yeah, because I bet Dave has CONFIG_HOTPLUG_CPU enabled and it's adding 
this to the number of possible cpus when in reality, per the spec, these 
cpus aren't possible at all because their enable bit isn't set in their 
lapic flags.

> So the problem is that the counting of disabled but hotpluggable CPUs 
> is over-eager.

In the kernel, yeah, and we don't distinguish between physically absent 
processors that have lapic entries and physically present but disabled 
processors.

> Since I haven't actually seen _true_ hotplug CPU 
> hardware yet, I'd argue we do the change below - allocating space for 
> never-present CPUs is stupid. If there's true hot-plug CPUs around 
> that could come online after we've booted, then we want to know about 
> them explicitly.
> 
> Thoughts?
> 
> Thanks,
> 
> 	Ingo
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index a32da80..75a351a 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1223,10 +1223,7 @@ __init void prefill_possible_map(void)
>  	i = setup_max_cpus ?: 1;
>  	if (setup_possible_cpus == -1) {
>  		possible = num_processors;
> -#ifdef CONFIG_HOTPLUG_CPU
> -		if (setup_max_cpus)
> -			possible += disabled_cpus;
> -#else
> +#ifndef CONFIG_HOTPLUG_CPU
>  		if (possible > i)
>  			possible = i;
>  #endif

Yeah, this should suppress the warning for Dave.  This way, the log only
reports a number of "hotplug CPUs" when possible_cpus= was used.

I think you should also just do "total_cpus = possible" though and forget
about disabled_cpus, or /sys/devices/system/cpu/offline is still going to
show him 4-7.

This function could benefit from a cleanup at the same time, it's not 
looking good:

 - "i" is a horribly named variable that stores the value so at least
   one cpu is possible when "nosmp" is used,

 - what's with the

   #ifdef CONFIG_HOTPLUG_CPU
	if (!setup_max_cpus)
   #endif ?

   if I do "maxcpus=4 nr_cpus=6 possible_cpus=8" what's the expected
   behavior?  We're not only testing for "nosmp" use here, "possible"
   should still be 4, and

 - the warning references "max_cpus" but the kernel command line option
   is "maxcpus"
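
To make the /sys/devices/system/cpu/offline remark above concrete, here is
a minimal sketch of counting the CPUs named by a sysfs-style CPU list such
as "4-7". The function name is hypothetical, and the accepted format
(comma-separated decimal numbers and ranges, optional trailing newline) is
an assumption based on what those files normally contain.

```c
#include <stdlib.h>

/*
 * Count the CPUs named by a list such as "4-7" or "0,2,4-7".
 * Returns the count, 0 for an empty list, or -1 if the text is malformed.
 */
static int cpulist_count(const char *s)
{
	int count = 0;

	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi;

		if (end == s || lo < 0)
			return -1;
		hi = lo;
		if (*end == '-') {
			/* A range: parse its upper bound. */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
		}
		count += (int)(hi - lo + 1);

		s = end;
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}
	return count;
}
```

Applied to the "4-7" that Dave would still see, this yields 4, i.e. the
four disabled entries that remain accounted for in total_cpus.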


* Re: disabled APICs being counted as processors ?
  2014-01-26  9:23         ` David Rientjes
@ 2014-01-26  9:29           ` Ingo Molnar
  2014-01-26  9:44             ` David Rientjes
  0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2014-01-26  9:29 UTC
  To: David Rientjes; +Cc: Dave Jones, x86, Linux Kernel, Yinghai Lu


* David Rientjes <rientjes@google.com> wrote:

> On Sun, 26 Jan 2014, Ingo Molnar wrote:
> 
> > > I don't think the "ACPI: LAPIC (... disabled)" lines are problematic, they 
> > > are simply reporting the acpi processor id and apic id for processors that 
> > > do not have their enabled flag set.  The acpi spec allows for these to 
> > > exist without the enabled flag set when the processor isn't present at all 
> > > because the kernel will make no attempt to use it.
> > > 
> > > That said, I think the "smpboot: 8 Processors exceeds NR_CPUS limit 
> > > of 4" line is unnecessary since, as you said, these processors don't 
> > > physically exist.  I betcha that's because you have 
> > > CONFIG_HOTPLUG_CPU enabled and it's counting the disabled cpus that 
> > > were found when acpi_register_lapic() was done.  The warning is only 
> > > really meaningful for cpus in cpu_possible_map, which aren't set for 
> > > your disabled four, in the hotplug case where NR_CPUS is too small.
> > 
> > No, this message is printed in prefill_possible_map() which 
> > _generates_ cpu_possible_map, so '8' is the number of bits in 
> > cpu_possible_map.
> > 
> 
> Yeah, because I bet Dave has CONFIG_HOTPLUG_CPU enabled and it's adding 
> this to the number of possible cpus when in reality, per the spec, these 
> cpus aren't possible at all because their enable bit isn't set in their 
> lapic flags.

Yeah, I suspect Dave has a distro-ish .config on his desktop, and 
distros generally enable all things hot-plug.

> > So the problem is that the counting of disabled but hotpluggable 
> > CPUs is over-eager.
> 
> In the kernel, yeah, and we don't distinguish between physically 
> absent processors that have lapic entries and physically present but 
> disabled processors.

Correct. Is there a robust distinction possible between the two?

> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -1223,10 +1223,7 @@ __init void prefill_possible_map(void)
> >  	i = setup_max_cpus ?: 1;
> >  	if (setup_possible_cpus == -1) {
> >  		possible = num_processors;
> > -#ifdef CONFIG_HOTPLUG_CPU
> > -		if (setup_max_cpus)
> > -			possible += disabled_cpus;
> > -#else
> > +#ifndef CONFIG_HOTPLUG_CPU
> >  		if (possible > i)
> >  			possible = i;
> >  #endif
> 
> Yeah, this should suppress the warning for Dave.  This way, the only way 
> the log reports the number of "hotplug CPUs" is because we used 
> possible_cpus.

Not just that, it also reduces the number of possible CPUs, which 
should reduce percpu memory allocation overhead, amongst other things, 
right?

> I think you should also just do "total_cpus = possible" though and 
> forget about disabled_cpus or /sys/devices/system/cpu/offline is 
> still going to show him 4-7.

Agreed.

> This function could benefit from a cleanup at the same time, it's 
> not looking good:
> 
>  - "i" is a horribly named variable that stores the value so at least
>    one cpu is possible when "nosmp" is used,
> 
>  - what's with the
> 
>    #ifdef CONFIG_HOTPLUG_CPU
> 	if (!setup_max_cpus)
>    #endif ?
> 
>    if I do "maxcpus=4 nr_cpus=6 possible_cpus=8" what's the expected
>    behavior?  We're not only testing for "nosmp" use here, "possible"
>    should still be 4, and
> 
>  - the warning references "max_cpus" but the kernel command line option
>    is "maxcpus"

Ack.

I wouldn't object to someone sending a changelogged, tested patch that 
does all that. Maybe two patches: first the cleanups, then the CPU 
count trimming. Just in case it regresses ...

Thanks,

	Ingo


* Re: disabled APICs being counted as processors ?
  2014-01-26  9:29           ` Ingo Molnar
@ 2014-01-26  9:44             ` David Rientjes
  0 siblings, 0 replies; 11+ messages in thread
From: David Rientjes @ 2014-01-26  9:44 UTC
  To: Ingo Molnar; +Cc: Dave Jones, x86, Linux Kernel, Yinghai Lu

On Sun, 26 Jan 2014, Ingo Molnar wrote:

> > > So the problem is that the counting of disabled but hotpluggable 
> > > CPUs is over-eager.
> > 
> > In the kernel, yeah, and we don't distinguish between physically 
> > absent processors that have lapic entries and physically present but 
> > disabled processors.
> 
> Correct. Is there a robust distinction possible between the two?
> 

Not with ACPI, I'm afraid: the spec allows a processor in either situation
to have no lapic entry at all or to have an entry with ACPI_MADT_ENABLED
clear.
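
As a sketch of why the flag alone cannot settle it: the MADT Processor
Local APIC entry is an 8-byte record whose flags word carries the enabled
bit in bit 0. The struct and names below are illustrative stand-ins rather
than the kernel's own definitions; LAPIC_FLAG_ENABLED is assumed to mirror
ACPI_MADT_ENABLED.

```c
#include <stdint.h>

/* Bit 0 of the flags word; assumed to mirror ACPI_MADT_ENABLED. */
#define LAPIC_FLAG_ENABLED 0x1u

/* Illustrative layout of a MADT Processor Local APIC entry (type 0). */
struct madt_lapic_entry {
	uint8_t  type;		/* 0 for a Processor Local APIC entry */
	uint8_t  length;	/* 8 */
	uint8_t  processor_id;	/* ACPI processor id */
	uint8_t  apic_id;	/* local APIC id */
	uint32_t lapic_flags;	/* bit 0: enabled */
};

/*
 * Returns nonzero if the entry is marked usable. A zero result does not
 * say whether the processor is physically absent or merely disabled.
 */
static int lapic_entry_usable(const struct madt_lapic_entry *e)
{
	return (e->lapic_flags & LAPIC_FLAG_ENABLED) != 0;
}
```

Dave's entries 0x05 to 0x08 would all report "not usable" here, and
nothing else in the record distinguishes an empty socket from a present
but disabled core.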

> > > --- a/arch/x86/kernel/smpboot.c
> > > +++ b/arch/x86/kernel/smpboot.c
> > > @@ -1223,10 +1223,7 @@ __init void prefill_possible_map(void)
> > >  	i = setup_max_cpus ?: 1;
> > >  	if (setup_possible_cpus == -1) {
> > >  		possible = num_processors;
> > > -#ifdef CONFIG_HOTPLUG_CPU
> > > -		if (setup_max_cpus)
> > > -			possible += disabled_cpus;
> > > -#else
> > > +#ifndef CONFIG_HOTPLUG_CPU
> > >  		if (possible > i)
> > >  			possible = i;
> > >  #endif
> > 
> > Yeah, this should suppress the warning for Dave.  This way, the only way 
> > the log reports the number of "hotplug CPUs" is because we used 
> > possible_cpus.
> 
> Not just that, it also reduces the number of possible CPUs, which 
> should reduce percpu memory allocation overhead, amongst other things, 
> right?
> 

Indeed, it gives people a good motivation for clearing out those 
unnecessary lapic entries :)

> I wouldn't object to someone sending a changelogged, tested patch that 
> does all that. Maybe two patches: first the cleanups, then the CPU 
> count trimming. Just in case it regresses ...
> 

Sounds good.  For the second patch, I first need to look into your point
about kexec as it relates to trimming the possible count.


Thread overview: 11+ messages
2014-01-23 22:13 disabled APICs being counted as processors ? Dave Jones
2014-01-25  7:41 ` Ingo Molnar
2014-01-25 15:30   ` Dave Jones
2014-01-26  6:41     ` David Rientjes
2014-01-26  8:36       ` Ingo Molnar
2014-01-26  8:51         ` Yinghai Lu
2014-01-26  9:09           ` Ingo Molnar
2014-01-26  9:23         ` David Rientjes
2014-01-26  9:29           ` Ingo Molnar
2014-01-26  9:44             ` David Rientjes
2014-01-25 16:42   ` Henrique de Moraes Holschuh
