* for_each_online_cpu broken ? @ 2005-12-08 5:07 Dave Jones 2005-12-08 5:26 ` Andi Kleen 0 siblings, 1 reply; 12+ messages in thread From: Dave Jones @ 2005-12-08 5:07 UTC (permalink / raw) To: linux-kernel; +Cc: ak Whilst debugging a memory leak, I hit sysrq meminfo, and got hot/cold info for CONFIG_NR_CPUS rather than 4 cpus I fixed a bug recently in mm/page_alloc.c to change this from a for_each_cpu to a for_each_online_cpu and I'm pretty certain I tested that it worked, but for reasons unknown, it's now misbehaving again. I've only tried reproducing this on x86-64 so far. Dave ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 5:07 for_each_online_cpu broken ? Dave Jones @ 2005-12-08 5:26 ` Andi Kleen 2005-12-08 5:33 ` Dave Jones 0 siblings, 1 reply; 12+ messages in thread From: Andi Kleen @ 2005-12-08 5:26 UTC (permalink / raw) To: Dave Jones, linux-kernel, ak On Thu, Dec 08, 2005 at 12:07:39AM -0500, Dave Jones wrote: > Whilst debugging a memory leak, I hit sysrq meminfo, > and got hot/cold info for CONFIG_NR_CPUS rather than 4 cpus > I fixed a bug recently in mm/page_alloc.c to change this from > a for_each_cpu to a for_each_online_cpu and I'm pretty certain > I tested that it worked, but for reasons unknown, it's now > misbehaving again. > > I've only tried reproducing this on x86-64 so far. If the online map is wrong all kinds of things would go wrong. Most likely your kernel doesn't have the fix. The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect of CPU hotplug, which now uses a better algorithm to guess the number of possible CPUs. In 2.6.15 you will just get half the number of available CPUs in addition by default -Andi ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 5:26 ` Andi Kleen @ 2005-12-08 5:33 ` Dave Jones 2005-12-08 5:38 ` David S. Miller ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Dave Jones @ 2005-12-08 5:33 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-kernel On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote: Hi Andi, > > Whilst debugging a memory leak, I hit sysrq meminfo, > > and got hot/cold info for CONFIG_NR_CPUS rather than 4 cpus > > > > I've only tried reproducing this on x86-64 so far. > > If the online map is wrong all kinds of things would go wrong. > > Most likely your kernel doesn't have the fix. This was seen with a .15rc5-git1 kernel. Is this something still living in your x86-64 patchset or -mm ? > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect > of CPU hotplug, which now uses a better algorithm to guess the > number of possible CPUs. In 2.6.15 you will just get half the number > of available CPUs in addition by default Yep, I noticed it offers a maximum of 6 cpus on my way. As a sidenote, seems kinda funny (and wasteful maybe?), doing this on a lot of hardware that isn't hotplug capable. (Whilst I could disable cpu hotplug in my local build, this isn't an answer for a generic distro kernel). Dave ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 5:33 ` Dave Jones @ 2005-12-08 5:38 ` David S. Miller 2005-12-08 6:12 ` Andi Kleen 2005-12-08 6:22 ` Nigel Cunningham 2005-12-08 6:30 ` Andi Kleen 2 siblings, 1 reply; 12+ messages in thread From: David S. Miller @ 2005-12-08 5:38 UTC (permalink / raw) To: davej; +Cc: ak, linux-kernel From: Dave Jones <davej@redhat.com> Date: Thu, 8 Dec 2005 00:33:02 -0500 > On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote: > > > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect > > of CPU hotplug, which now uses a better algorithm to guess the > > number of possible CPUs. In 2.6.15 you will just get half the number > > of available CPUs in addition by default > > Yep, I noticed it offers a maximum of 6 cpus on my way. > As a sidenote, seems kinda funny (and wasteful maybe?), doing this > on a lot of hardware that isn't hotplug capable. (Whilst I could > disable cpu hotplug in my local build, this isn't an answer for > a generic distro kernel). This can be dangerous btw, as some subsystem such as netfilter allocate enormous datastructures based upon the largest possible cpu number in the system. In 2.6.16 it will use something a bit more intelligent, but overestimating the possible cpu set can be quite a waste. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 5:38 ` David S. Miller @ 2005-12-08 6:12 ` Andi Kleen 2005-12-08 6:27 ` Dave Jones 0 siblings, 1 reply; 12+ messages in thread From: Andi Kleen @ 2005-12-08 6:12 UTC (permalink / raw) To: David S. Miller; +Cc: davej, ak, linux-kernel On Wed, Dec 07, 2005 at 09:38:25PM -0800, David S. Miller wrote: > From: Dave Jones <davej@redhat.com> > Date: Thu, 8 Dec 2005 00:33:02 -0500 > > > On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote: > > > > > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect > > > of CPU hotplug, which now uses a better algorithm to guess the > > > number of possible CPUs. In 2.6.15 you will just get half the number > > > of available CPUs in addition by default > > > > Yep, I noticed it offers a maximum of 6 cpus on my way. > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this > > on a lot of hardware that isn't hotplug capable. (Whilst I could > > disable cpu hotplug in my local build, this isn't an answer for > > a generic distro kernel). If you can figure out a way to detect this please share. The ACPI designers unfortunately didn't think that far (they did it right for memory hotplug, but not for CPU) I invented an ACPI extensin for it, but it's non standard so the half of CPUs is used as a default unless overwritten (additional_cpus=NUM) Anyways I changed it earlier to 1 additional CPU by default. > > This can be dangerous btw, as some subsystem such as netfilter > allocate enormous datastructures based upon the largest possible > cpu number in the system. It is a big improvement in fact. Previously with NR_CPUS==128 (default on some distros) and CPU_HOTPLUG enabled it would always allocate several hundred KB of just standard per cpu data unnecessarily. -Andi ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 6:12 ` Andi Kleen @ 2005-12-08 6:27 ` Dave Jones 0 siblings, 0 replies; 12+ messages in thread From: Dave Jones @ 2005-12-08 6:27 UTC (permalink / raw) To: Andi Kleen; +Cc: David S. Miller, linux-kernel On Thu, Dec 08, 2005 at 07:12:12AM +0100, Andi Kleen wrote: > On Wed, Dec 07, 2005 at 09:38:25PM -0800, David S. Miller wrote: > > From: Dave Jones <davej@redhat.com> > > Date: Thu, 8 Dec 2005 00:33:02 -0500 > > > > > On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote: > > > > > > > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect > > > > of CPU hotplug, which now uses a better algorithm to guess the > > > > number of possible CPUs. In 2.6.15 you will just get half the number > > > > of available CPUs in addition by default > > > > > > Yep, I noticed it offers a maximum of 6 cpus on my way. > > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this > > > on a lot of hardware that isn't hotplug capable. (Whilst I could > > > disable cpu hotplug in my local build, this isn't an answer for > > > a generic distro kernel). > > If you can figure out a way to detect this please share. > The ACPI designers unfortunately didn't think that far > (they did it right for memory hotplug, but not for CPU) > > I invented an ACPI extensin for it, but it's non standard > so the half of CPUs is used as a default unless overwritten > (additional_cpus=NUM) > > Anyways I changed it earlier to 1 additional CPU by default. Just guessing seems to be pretty guaranteed to give the wrong answer. I think it makes more sense to say "if your BIOS doesn't give the relevant info (as is usually the case), boot with additional_cpus) Penalising the many for the needs of the few just seems wrong. Dave ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 5:33 ` Dave Jones 2005-12-08 5:38 ` David S. Miller @ 2005-12-08 6:22 ` Nigel Cunningham 2005-12-08 6:28 ` Dave Jones 2005-12-08 6:30 ` Andi Kleen 2 siblings, 1 reply; 12+ messages in thread From: Nigel Cunningham @ 2005-12-08 6:22 UTC (permalink / raw) To: Dave Jones; +Cc: Andi Kleen, Linux Kernel Mailing List Hi. On Thu, 2005-12-08 at 15:33, Dave Jones wrote: > On Thu, Dec 08, 2005 at 06:26:32AM +0100, Andi Kleen wrote: > > Hi Andi, > > > > Whilst debugging a memory leak, I hit sysrq meminfo, > > > and got hot/cold info for CONFIG_NR_CPUS rather than 4 cpus > > > > > > I've only tried reproducing this on x86-64 so far. > > > > If the online map is wrong all kinds of things would go wrong. > > > > Most likely your kernel doesn't have the fix. > > This was seen with a .15rc5-git1 kernel. > Is this something still living in your x86-64 patchset or -mm ? > > > The possible map is fixed kind of BTW in 2.6.15rc*. It was a side effect > > of CPU hotplug, which now uses a better algorithm to guess the > > number of possible CPUs. In 2.6.15 you will just get half the number > > of available CPUs in addition by default > > Yep, I noticed it offers a maximum of 6 cpus on my way. > As a sidenote, seems kinda funny (and wasteful maybe?), doing this > on a lot of hardware that isn't hotplug capable. (Whilst I could > disable cpu hotplug in my local build, this isn't an answer for > a generic distro kernel). Both suspend to disk (and suspend to ram?) implementations now depend on hotplug_cpu to enable extra cpus, so there is at least one reason for them to want hotplug support in a generic kernel. Regards, Nigel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 6:22 ` Nigel Cunningham @ 2005-12-08 6:28 ` Dave Jones 2005-12-08 6:30 ` Andi Kleen 2005-12-09 0:03 ` Nigel Cunningham 0 siblings, 2 replies; 12+ messages in thread From: Dave Jones @ 2005-12-08 6:28 UTC (permalink / raw) To: Nigel Cunningham; +Cc: Andi Kleen, Linux Kernel Mailing List On Thu, Dec 08, 2005 at 04:22:05PM +1000, Nigel Cunningham wrote: > > Yep, I noticed it offers a maximum of 6 cpus on my way. > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this > > on a lot of hardware that isn't hotplug capable. (Whilst I could > > disable cpu hotplug in my local build, this isn't an answer for > > a generic distro kernel). > > Both suspend to disk (and suspend to ram?) implementations now depend on > hotplug_cpu to enable extra cpus, so there is at least one reason for > them to want hotplug support in a generic kernel. You mean suspend -> plug in a new cpu -> resume transitions ? That sounds *terrifying*. Dave ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 6:28 ` Dave Jones @ 2005-12-08 6:30 ` Andi Kleen 2005-12-09 0:03 ` Nigel Cunningham 1 sibling, 0 replies; 12+ messages in thread From: Andi Kleen @ 2005-12-08 6:30 UTC (permalink / raw) To: Dave Jones, Nigel Cunningham, Andi Kleen, Linux Kernel Mailing List On Thu, Dec 08, 2005 at 01:28:44AM -0500, Dave Jones wrote: > On Thu, Dec 08, 2005 at 04:22:05PM +1000, Nigel Cunningham wrote: > > > > Yep, I noticed it offers a maximum of 6 cpus on my way. > > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this > > > on a lot of hardware that isn't hotplug capable. (Whilst I could > > > disable cpu hotplug in my local build, this isn't an answer for > > > a generic distro kernel). > > > > Both suspend to disk (and suspend to ram?) implementations now depend on > > hotplug_cpu to enable extra cpus, so there is at least one reason for > > them to want hotplug support in a generic kernel. > > You mean suspend -> plug in a new cpu -> resume transitions ? > That sounds *terrifying*. No, they unplug all non BP CPUs before suspending. It obviously doesn't require additional CPUs though. -Andi ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 6:28 ` Dave Jones 2005-12-08 6:30 ` Andi Kleen @ 2005-12-09 0:03 ` Nigel Cunningham 1 sibling, 0 replies; 12+ messages in thread From: Nigel Cunningham @ 2005-12-09 0:03 UTC (permalink / raw) To: Dave Jones; +Cc: Andi Kleen, Linux Kernel Mailing List Hi. On Thu, 2005-12-08 at 16:28, Dave Jones wrote: > On Thu, Dec 08, 2005 at 04:22:05PM +1000, Nigel Cunningham wrote: > > > > Yep, I noticed it offers a maximum of 6 cpus on my way. > > > As a sidenote, seems kinda funny (and wasteful maybe?), doing this > > > on a lot of hardware that isn't hotplug capable. (Whilst I could > > > disable cpu hotplug in my local build, this isn't an answer for > > > a generic distro kernel). > > > > Both suspend to disk (and suspend to ram?) implementations now depend on > > hotplug_cpu to enable extra cpus, so there is at least one reason for > > them to want hotplug support in a generic kernel. > > You mean suspend -> plug in a new cpu -> resume transitions ? > That sounds *terrifying*. Andi is right, it's just a logical unplug. But having said that, I suppose extra cpus could be plugged/unplugged during a suspend to disk. Not that I've ever tried it. I have a real SMP mobo, but haven't had the opportunity to fire it up. Regards, Nigel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 5:33 ` Dave Jones 2005-12-08 5:38 ` David S. Miller 2005-12-08 6:22 ` Nigel Cunningham @ 2005-12-08 6:30 ` Andi Kleen 2005-12-08 6:43 ` Dave Jones 2 siblings, 1 reply; 12+ messages in thread From: Andi Kleen @ 2005-12-08 6:30 UTC (permalink / raw) To: Dave Jones, Andi Kleen, linux-kernel > This was seen with a .15rc5-git1 kernel. > Is this something still living in your x86-64 patchset or -mm ? Only the change for less additional CPUs. -Andi ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: for_each_online_cpu broken ? 2005-12-08 6:30 ` Andi Kleen @ 2005-12-08 6:43 ` Dave Jones 0 siblings, 0 replies; 12+ messages in thread From: Dave Jones @ 2005-12-08 6:43 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-kernel On Thu, Dec 08, 2005 at 07:30:50AM +0100, Andi Kleen wrote: > > This was seen with a .15rc5-git1 kernel. > > Is this something still living in your x86-64 patchset or -mm ? > > Only the change for less additional CPUs. Ok, then I'm at a loss why it's dumping state for NR_CPUS cpus. I'm away tomorrow, but I'll poke at it some more on Friday. Dave ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2005-12-09 0:06 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-12-08 5:07 for_each_online_cpu broken ? Dave Jones 2005-12-08 5:26 ` Andi Kleen 2005-12-08 5:33 ` Dave Jones 2005-12-08 5:38 ` David S. Miller 2005-12-08 6:12 ` Andi Kleen 2005-12-08 6:27 ` Dave Jones 2005-12-08 6:22 ` Nigel Cunningham 2005-12-08 6:28 ` Dave Jones 2005-12-08 6:30 ` Andi Kleen 2005-12-09 0:03 ` Nigel Cunningham 2005-12-08 6:30 ` Andi Kleen 2005-12-08 6:43 ` Dave Jones
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox