public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* for_each_online_cpu broken ?
@ 2005-12-08  5:07 Dave Jones
  2005-12-08  5:26 ` Andi Kleen
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Jones @ 2005-12-08  5:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: ak

Whilst debugging a memory leak, I hit sysrq meminfo,
and got hot/cold info for CONFIG_NR_CPUS rather than 4 cpus
I fixed a bug recently in mm/page_alloc.c to change this from
a for_each_cpu to a for_each_online_cpu and I'm pretty certain
I tested that it worked, but for reasons unknown, it's now
misbehaving again.

I've only tried reproducing this on x86-64 so far.

		Dave


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  5:07 for_each_online_cpu broken ? Dave Jones
@ 2005-12-08  5:26 ` Andi Kleen
  2005-12-08  5:33   ` Dave Jones
  0 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2005-12-08  5:26 UTC (permalink / raw)
  To: Dave Jones, linux-kernel, ak

On Thu, Dec 08, 2005 at 12:07:39AM -0500, Dave Jones wrote:
> Whilst debugging a memory leak, I hit sysrq meminfo,
> and got hot/cold info for CONFIG_NR_CPUS rather than 4 cpus
> I fixed a bug recently in mm/page_alloc.c to change this from
> a for_each_cpu to a for_each_online_cpu and I'm pretty certain
> I tested that it worked, but for reasons unknown, it's now
> misbehaving again.
> 
> I've only tried reproducing this on x86-64 so far.

If the online map is wrong all kinds of things would go wrong.

Most likely your kernel doesn't have the fix.

The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect
of CPU hotplug, which now uses a better algorithm to guess the 
number of possible CPUs. In 2.6.15 you will just get half the number
of available CPUs in addition by default

-Andi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  5:26 ` Andi Kleen
@ 2005-12-08  5:33   ` Dave Jones
  2005-12-08  5:38     ` David S. Miller
                       ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Dave Jones @ 2005-12-08  5:33 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote:

Hi Andi,

 > > Whilst debugging a memory leak, I hit sysrq meminfo,
 > > and got hot/cold info for CONFIG_NR_CPUS rather than 4 cpus
 > > 
 > > I've only tried reproducing this on x86-64 so far.
 > 
 > If the online map is wrong all kinds of things would go wrong.
 > 
 > Most likely your kernel doesn't have the fix.

This was seen with a .15rc5-git1 kernel.
Is this something still living in your x86-64 patchset or -mm ?
 
 > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect
 > of CPU hotplug, which now uses a better algorithm to guess the 
 > number of possible CPUs. In 2.6.15 you will just get half the number
 > of available CPUs in addition by default

Yep, I noticed it offers a maximum of 6 cpus on my way.
As a sidenote, seems kinda funny (and wasteful maybe?), doing this
on a lot of hardware that isn't hotplug capable. (Whilst I could
disable cpu hotplug in my local build, this isn't an answer for
a generic distro kernel).

		Dave


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  5:33   ` Dave Jones
@ 2005-12-08  5:38     ` David S. Miller
  2005-12-08  6:12       ` Andi Kleen
  2005-12-08  6:22     ` Nigel Cunningham
  2005-12-08  6:30     ` Andi Kleen
  2 siblings, 1 reply; 12+ messages in thread
From: David S. Miller @ 2005-12-08  5:38 UTC (permalink / raw)
  To: davej; +Cc: ak, linux-kernel

From: Dave Jones <davej@redhat.com>
Date: Thu, 8 Dec 2005 00:33:02 -0500

> On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote:
> 
>  > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect
>  > of CPU hotplug, which now uses a better algorithm to guess the 
>  > number of possible CPUs. In 2.6.15 you will just get half the number
>  > of available CPUs in addition by default
> 
> Yep, I noticed it offers a maximum of 6 cpus on my way.
> As a sidenote, seems kinda funny (and wasteful maybe?), doing this
> on a lot of hardware that isn't hotplug capable. (Whilst I could
> disable cpu hotplug in my local build, this isn't an answer for
> a generic distro kernel).

This can be dangerous btw, as some subsystem such as netfilter
allocate enormous datastructures based upon the largest possible
cpu number in the system.

In 2.6.16 it will use something a bit more intelligent, but
overestimating the possible cpu set can be quite a waste.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  5:38     ` David S. Miller
@ 2005-12-08  6:12       ` Andi Kleen
  2005-12-08  6:27         ` Dave Jones
  0 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2005-12-08  6:12 UTC (permalink / raw)
  To: David S. Miller; +Cc: davej, ak, linux-kernel

On Wed, Dec 07, 2005 at 09:38:25PM -0800, David S. Miller wrote:
> From: Dave Jones <davej@redhat.com>
> Date: Thu, 8 Dec 2005 00:33:02 -0500
> 
> > On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote:
> > 
> >  > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect
> >  > of CPU hotplug, which now uses a better algorithm to guess the 
> >  > number of possible CPUs. In 2.6.15 you will just get half the number
> >  > of available CPUs in addition by default
> > 
> > Yep, I noticed it offers a maximum of 6 cpus on my way.
> > As a sidenote, seems kinda funny (and wasteful maybe?), doing this
> > on a lot of hardware that isn't hotplug capable. (Whilst I could
> > disable cpu hotplug in my local build, this isn't an answer for
> > a generic distro kernel).

If you can figure out a way to detect this please share.
The ACPI designers unfortunately didn't think that far
(they did it right for memory hotplug, but not for CPU) 

I invented an ACPI extensin for it, but it's non standard
so the half of CPUs is used as a default unless overwritten
(additional_cpus=NUM) 

Anyways I changed it earlier to 1 additional CPU by default.

> 
> This can be dangerous btw, as some subsystem such as netfilter
> allocate enormous datastructures based upon the largest possible
> cpu number in the system.

It is a big improvement in fact. Previously with NR_CPUS==128 
(default on some distros) and CPU_HOTPLUG enabled it would
always allocate several hundred KB of just standard 
per cpu data unnecessarily.

-Andi


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  5:33   ` Dave Jones
  2005-12-08  5:38     ` David S. Miller
@ 2005-12-08  6:22     ` Nigel Cunningham
  2005-12-08  6:28       ` Dave Jones
  2005-12-08  6:30     ` Andi Kleen
  2 siblings, 1 reply; 12+ messages in thread
From: Nigel Cunningham @ 2005-12-08  6:22 UTC (permalink / raw)
  To: Dave Jones; +Cc: Andi Kleen, Linux Kernel Mailing List

Hi.

On Thu, 2005-12-08 at 15:33, Dave Jones wrote:
> On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote:
> 
> Hi Andi,
> 
>  > > Whilst debugging a memory leak, I hit sysrq meminfo,
>  > > and got hot/cold info for CONFIG_NR_CPUS rather than 4 cpus
>  > > 
>  > > I've only tried reproducing this on x86-64 so far.
>  > 
>  > If the online map is wrong all kinds of things would go wrong.
>  > 
>  > Most likely your kernel doesn't have the fix.
> 
> This was seen with a .15rc5-git1 kernel.
> Is this something still living in your x86-64 patchset or -mm ?
>  
>  > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect
>  > of CPU hotplug, which now uses a better algorithm to guess the 
>  > number of possible CPUs. In 2.6.15 you will just get half the number
>  > of available CPUs in addition by default
> 
> Yep, I noticed it offers a maximum of 6 cpus on my way.
> As a sidenote, seems kinda funny (and wasteful maybe?), doing this
> on a lot of hardware that isn't hotplug capable. (Whilst I could
> disable cpu hotplug in my local build, this isn't an answer for
> a generic distro kernel).

Both suspend to disk (and suspend to ram?) implementations now depend on
hotplug_cpu to enable extra cpus, so there is at least one reason for
them to want hotplug support in a generic kernel.

Regards,

Nigel


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  6:12       ` Andi Kleen
@ 2005-12-08  6:27         ` Dave Jones
  0 siblings, 0 replies; 12+ messages in thread
From: Dave Jones @ 2005-12-08  6:27 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David S. Miller, linux-kernel

On Thu, Dec 08, 2005 at 07:12:12AM +0100, Andi Kleen wrote:
 > On Wed, Dec 07, 2005 at 09:38:25PM -0800, David S. Miller wrote:
 > > From: Dave Jones <davej@redhat.com>
 > > Date: Thu, 8 Dec 2005 00:33:02 -0500
 > > 
 > > > On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote:
 > > > 
 > > >  > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect
 > > >  > of CPU hotplug, which now uses a better algorithm to guess the 
 > > >  > number of possible CPUs. In 2.6.15 you will just get half the number
 > > >  > of available CPUs in addition by default
 > > > 
 > > > Yep, I noticed it offers a maximum of 6 cpus on my way.
 > > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this
 > > > on a lot of hardware that isn't hotplug capable. (Whilst I could
 > > > disable cpu hotplug in my local build, this isn't an answer for
 > > > a generic distro kernel).
 > 
 > If you can figure out a way to detect this please share.
 > The ACPI designers unfortunately didn't think that far
 > (they did it right for memory hotplug, but not for CPU) 
 > 
 > I invented an ACPI extensin for it, but it's non standard
 > so the half of CPUs is used as a default unless overwritten
 > (additional_cpus=NUM) 
 > 
 > Anyways I changed it earlier to 1 additional CPU by default.

Just guessing seems to be pretty guaranteed to give the wrong answer.
I think it makes more sense to say "if your BIOS doesn't give
the relevant info (as is usually the case), boot with additional_cpus)

Penalising the many for the needs of the few just seems wrong.
		
		Dave


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  6:22     ` Nigel Cunningham
@ 2005-12-08  6:28       ` Dave Jones
  2005-12-08  6:30         ` Andi Kleen
  2005-12-09  0:03         ` Nigel Cunningham
  0 siblings, 2 replies; 12+ messages in thread
From: Dave Jones @ 2005-12-08  6:28 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: Andi Kleen, Linux Kernel Mailing List

On Thu, Dec 08, 2005 at 04:22:05PM +1000, Nigel Cunningham wrote:

 > > Yep, I noticed it offers a maximum of 6 cpus on my way.
 > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this
 > > on a lot of hardware that isn't hotplug capable. (Whilst I could
 > > disable cpu hotplug in my local build, this isn't an answer for
 > > a generic distro kernel).
 > 
 > Both suspend to disk (and suspend to ram?) implementations now depend on
 > hotplug_cpu to enable extra cpus, so there is at least one reason for
 > them to want hotplug support in a generic kernel.

You mean suspend -> plug in a new cpu -> resume transitions ?
That sounds *terrifying*.

		Dave


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  6:28       ` Dave Jones
@ 2005-12-08  6:30         ` Andi Kleen
  2005-12-09  0:03         ` Nigel Cunningham
  1 sibling, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2005-12-08  6:30 UTC (permalink / raw)
  To: Dave Jones, Nigel Cunningham, Andi Kleen,
	Linux Kernel Mailing List

On Thu, Dec 08, 2005 at 01:28:44AM -0500, Dave Jones wrote:
> On Thu, Dec 08, 2005 at 04:22:05PM +1000, Nigel Cunningham wrote:
> 
>  > > Yep, I noticed it offers a maximum of 6 cpus on my way.
>  > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this
>  > > on a lot of hardware that isn't hotplug capable. (Whilst I could
>  > > disable cpu hotplug in my local build, this isn't an answer for
>  > > a generic distro kernel).
>  > 
>  > Both suspend to disk (and suspend to ram?) implementations now depend on
>  > hotplug_cpu to enable extra cpus, so there is at least one reason for
>  > them to want hotplug support in a generic kernel.
> 
> You mean suspend -> plug in a new cpu -> resume transitions ?
> That sounds *terrifying*.

No, they unplug all non BP CPUs before suspending. 
It obviously doesn't require additional CPUs though.


-Andi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  5:33   ` Dave Jones
  2005-12-08  5:38     ` David S. Miller
  2005-12-08  6:22     ` Nigel Cunningham
@ 2005-12-08  6:30     ` Andi Kleen
  2005-12-08  6:43       ` Dave Jones
  2 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2005-12-08  6:30 UTC (permalink / raw)
  To: Dave Jones, Andi Kleen, linux-kernel

> This was seen with a .15rc5-git1 kernel.
> Is this something still living in your x86-64 patchset or -mm ?

Only the change for less additional CPUs. 

-Andi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  6:30     ` Andi Kleen
@ 2005-12-08  6:43       ` Dave Jones
  0 siblings, 0 replies; 12+ messages in thread
From: Dave Jones @ 2005-12-08  6:43 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

On Thu, Dec 08, 2005 at 07:30:50AM +0100, Andi Kleen wrote:
 > > This was seen with a .15rc5-git1 kernel.
 > > Is this something still living in your x86-64 patchset or -mm ?
 > 
 > Only the change for less additional CPUs. 

Ok, then I'm at a loss why it's dumping state for NR_CPUS cpus.
I'm away tomorrow, but I'll poke at it some more on Friday.

		Dave


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: for_each_online_cpu broken ?
  2005-12-08  6:28       ` Dave Jones
  2005-12-08  6:30         ` Andi Kleen
@ 2005-12-09  0:03         ` Nigel Cunningham
  1 sibling, 0 replies; 12+ messages in thread
From: Nigel Cunningham @ 2005-12-09  0:03 UTC (permalink / raw)
  To: Dave Jones; +Cc: Andi Kleen, Linux Kernel Mailing List

Hi.

On Thu, 2005-12-08 at 16:28, Dave Jones wrote:
> On Thu, Dec 08, 2005 at 04:22:05PM +1000, Nigel Cunningham wrote:
> 
>  > > Yep, I noticed it offers a maximum of 6 cpus on my way.
>  > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this
>  > > on a lot of hardware that isn't hotplug capable. (Whilst I could
>  > > disable cpu hotplug in my local build, this isn't an answer for
>  > > a generic distro kernel).
>  > 
>  > Both suspend to disk (and suspend to ram?) implementations now depend on
>  > hotplug_cpu to enable extra cpus, so there is at least one reason for
>  > them to want hotplug support in a generic kernel.
> 
> You mean suspend -> plug in a new cpu -> resume transitions ?
> That sounds *terrifying*.

Andi is right, it's just a logical unplug. But having said that, I
suppose extra cpus could be plugged/unplugged during a suspend to disk.
Not that I've ever tried it. I have a real SMP mobo, but haven't had the
opportunity to fire it up.

Regards,

Nigel


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2005-12-09  0:06 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-12-08  5:07 for_each_online_cpu broken ? Dave Jones
2005-12-08  5:26 ` Andi Kleen
2005-12-08  5:33   ` Dave Jones
2005-12-08  5:38     ` David S. Miller
2005-12-08  6:12       ` Andi Kleen
2005-12-08  6:27         ` Dave Jones
2005-12-08  6:22     ` Nigel Cunningham
2005-12-08  6:28       ` Dave Jones
2005-12-08  6:30         ` Andi Kleen
2005-12-09  0:03         ` Nigel Cunningham
2005-12-08  6:30     ` Andi Kleen
2005-12-08  6:43       ` Dave Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox