All of lore.kernel.org
 help / color / mirror / Atom feed
* Bogus load average and cpu times on x86_64 SMP kernels
@ 2005-10-05  7:16 Brian Gerst
  2005-10-05 18:00 ` Brian Gerst
  0 siblings, 1 reply; 6+ messages in thread
From: Brian Gerst @ 2005-10-05  7:16 UTC (permalink / raw)
  To: lkml; +Cc: Andi Kleen

I've been seeing bogus values from /proc/loadavg on an x86-64 SMP kernel 
(but not UP).

$ cat /proc/loadavg
-1012098.26 922203.26 -982431.60 1/112 2688

This is in the current git tree.  I'm also seeing strange values in 
/proc/stat:

cpu  2489 40 920 60530 9398 171 288 1844674407350
cpu0 2509 60 940 60550 9418 191 308 0

The first line is the sum of all cpus (I only have one), so it's picking 
up up bad data from the non-present cpus.  The last value, stolen time, 
is completely bogus since that value is only ever used on s390.

It looks to me like there is some problem with how the per-cpu 
structures are being initialized, or are getting corrupted.  I have not 
been able to test i386 SMP yet to see if the problem is x86_64 specific.

--
				Brian Gerst

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bogus load average and cpu times on x86_64 SMP kernels
  2005-10-05  7:16 Bogus load average and cpu times on x86_64 SMP kernels Brian Gerst
@ 2005-10-05 18:00 ` Brian Gerst
  2005-10-07  4:15   ` [PATCH] Fix hotplug cpu on x86_64 Brian Gerst
  0 siblings, 1 reply; 6+ messages in thread
From: Brian Gerst @ 2005-10-05 18:00 UTC (permalink / raw)
  To: Andi Kleen; +Cc: lkml

Brian Gerst wrote:
> I've been seeing bogus values from /proc/loadavg on an x86-64 SMP kernel 
> (but not UP).
> 
> $ cat /proc/loadavg
> -1012098.26 922203.26 -982431.60 1/112 2688
> 
> This is in the current git tree.  I'm also seeing strange values in 
> /proc/stat:
> 
> cpu  2489 40 920 60530 9398 171 288 1844674407350
> cpu0 2509 60 940 60550 9418 191 308 0
> 
> The first line is the sum of all cpus (I only have one), so it's picking 
> up up bad data from the non-present cpus.  The last value, stolen time, 
> is completely bogus since that value is only ever used on s390.
> 
> It looks to me like there is some problem with how the per-cpu 
> structures are being initialized, or are getting corrupted.  I have not 
> been able to test i386 SMP yet to see if the problem is x86_64 specific.

I found the culprit: CPU hotplug.  The problem is that 
prefill_possible_map() is called after setup_per_cpu_areas().  This 
leaves the per-cpu data sections for the future cpus uninitialized 
(still pointing to the original per-cpu data, which is initmem).  Since 
the cpus exists in cpu_possible_map, for_each_cpu will iterate over them 
even though the per-cpu data is invalid.

--
				Brian Gerst

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Fix hotplug cpu on x86_64
  2005-10-05 18:00 ` Brian Gerst
@ 2005-10-07  4:15   ` Brian Gerst
  2005-10-07  9:50     ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: Brian Gerst @ 2005-10-07  4:15 UTC (permalink / raw)
  To: lkml; +Cc: Andrew Morton, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 1529 bytes --]

Brian Gerst wrote:
> Brian Gerst wrote:
>> I've been seeing bogus values from /proc/loadavg on an x86-64 SMP 
>> kernel (but not UP).
>>
>> $ cat /proc/loadavg
>> -1012098.26 922203.26 -982431.60 1/112 2688
>>
>> This is in the current git tree.  I'm also seeing strange values in 
>> /proc/stat:
>>
>> cpu  2489 40 920 60530 9398 171 288 1844674407350
>> cpu0 2509 60 940 60550 9418 191 308 0
>>
>> The first line is the sum of all cpus (I only have one), so it's 
>> picking up up bad data from the non-present cpus.  The last value, 
>> stolen time, is completely bogus since that value is only ever used on 
>> s390.
>>
>> It looks to me like there is some problem with how the per-cpu 
>> structures are being initialized, or are getting corrupted.  I have 
>> not been able to test i386 SMP yet to see if the problem is x86_64 
>> specific.
> 
> I found the culprit: CPU hotplug.  The problem is that 
> prefill_possible_map() is called after setup_per_cpu_areas().  This 
> leaves the per-cpu data sections for the future cpus uninitialized 
> (still pointing to the original per-cpu data, which is initmem).  Since 
> the cpus exists in cpu_possible_map, for_each_cpu will iterate over them 
> even though the per-cpu data is invalid.

Initialize cpu_possible_map properly in the hotplug cpu case.  Before 
this, the map was filled in after the per-cpu areas were set up.  This 
left those cpus pointing to initmem and causing invalid data in 
/proc/stat and elsewhere.

Signed-off-by: Brian Gerst <bgerst@didntduck.org>

[-- Attachment #2: 0001-Fix-hotplug-cpu-on-x86_64.txt --]
[-- Type: text/plain, Size: 2759 bytes --]

Subject: [PATCH] Fix hotplug cpu on x86_64

Initialize cpu_possible_map properly in the hotplug cpu case.  Before
this, the map was filled in after the per-cpu areas were set up.  This left
those cpus pointing to initmem and causing invalid data in /proc/stat and
elsewhere.

Signed-off-by: Brian Gerst <bgerst@didntduck.org>

---

 arch/x86_64/kernel/smpboot.c |   41 ++++++++++++++++-------------------------
 1 files changed, 16 insertions(+), 25 deletions(-)

12027fcffd85447f0fbf28264c5ee072715c345b
diff --git a/arch/x86_64/kernel/smpboot.c b/arch/x86_64/kernel/smpboot.c
--- a/arch/x86_64/kernel/smpboot.c
+++ b/arch/x86_64/kernel/smpboot.c
@@ -80,7 +80,23 @@ EXPORT_SYMBOL(cpu_online_map);
 cpumask_t cpu_callin_map;
 cpumask_t cpu_callout_map;
 
+#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * cpu_possible_map should be static, it cannot change as cpu's
+ * are onlined, or offlined. The reason is per-cpu data-structures
+ * are allocated by some modules at init time, and dont expect to
+ * do this dynamically on cpu arrival/departure.
+ * cpu_present_map on the other hand can change dynamically.
+ * In case when cpu_hotplug is not compiled, then we resort to current
+ * behaviour, which is cpu_possible == cpu_present.
+ * If cpu-hotplug is supported, then we need to preallocate for all
+ * those NR_CPUS, hence cpu_possible_map represents entire NR_CPUS range.
+ * - Ashok Raj
+ */
+cpumask_t cpu_possible_map = CPU_MASK_ALL;
+#else
 cpumask_t cpu_possible_map;
+#endif
 EXPORT_SYMBOL(cpu_possible_map);
 
 /* Per CPU bogomips and other parameters */
@@ -879,27 +895,6 @@ static __init void disable_smp(void)
 	cpu_set(0, cpu_core_map[0]);
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
-/*
- * cpu_possible_map should be static, it cannot change as cpu's
- * are onlined, or offlined. The reason is per-cpu data-structures
- * are allocated by some modules at init time, and dont expect to
- * do this dynamically on cpu arrival/departure.
- * cpu_present_map on the other hand can change dynamically.
- * In case when cpu_hotplug is not compiled, then we resort to current
- * behaviour, which is cpu_possible == cpu_present.
- * If cpu-hotplug is supported, then we need to preallocate for all
- * those NR_CPUS, hence cpu_possible_map represents entire NR_CPUS range.
- * - Ashok Raj
- */
-static void prefill_possible_map(void)
-{
-	int i;
-	for (i = 0; i < NR_CPUS; i++)
-		cpu_set(i, cpu_possible_map);
-}
-#endif
-
 /*
  * Various sanity checks.
  */
@@ -967,10 +962,6 @@ void __init smp_prepare_cpus(unsigned in
 	current_cpu_data = boot_cpu_data;
 	current_thread_info()->cpu = 0;  /* needed? */
 
-#ifdef CONFIG_HOTPLUG_CPU
-	prefill_possible_map();
-#endif
-
 	if (smp_sanity_check(max_cpus) < 0) {
 		printk(KERN_INFO "SMP disabled\n");
 		disable_smp();

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix hotplug cpu on x86_64
  2005-10-07  4:15   ` [PATCH] Fix hotplug cpu on x86_64 Brian Gerst
@ 2005-10-07  9:50     ` Andi Kleen
  2005-10-08  0:50       ` Anton Blanchard
  0 siblings, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2005-10-07  9:50 UTC (permalink / raw)
  To: Brian Gerst; +Cc: lkml, Andrew Morton, Andi Kleen

A similar patch is already queued with Linus.

I also have a followon patch to avoid the extreme memory wastage
currently caused by hotplug CPUs (e.g. with NR_CPUS==128 you currently
lose 4MB of memory just for preallocated per CPU data). But that is
something for post 2.6.14.

See ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current/patches/* 
for the current queue.

-Andi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix hotplug cpu on x86_64
  2005-10-07  9:50     ` Andi Kleen
@ 2005-10-08  0:50       ` Anton Blanchard
  2005-10-08 10:33         ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: Anton Blanchard @ 2005-10-08  0:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Brian Gerst, lkml, Andrew Morton


Hi,

> I also have a followon patch to avoid the extreme memory wastage
> currently caused by hotplug CPUs (e.g. with NR_CPUS==128 you currently
> lose 4MB of memory just for preallocated per CPU data). But that is
> something for post 2.6.14.

Im interested in doing that on ppc64 too. Are you currently only
creating per cpu data areas for possible cpus? The generic code does
NR_CPUS worth, we should change that in 2.6.15.

Anton

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix hotplug cpu on x86_64
  2005-10-08  0:50       ` Anton Blanchard
@ 2005-10-08 10:33         ` Andi Kleen
  0 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2005-10-08 10:33 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: Brian Gerst, lkml, Andrew Morton

On Saturday 08 October 2005 02:50, Anton Blanchard wrote:
> Hi,
>
> > I also have a followon patch to avoid the extreme memory wastage
> > currently caused by hotplug CPUs (e.g. with NR_CPUS==128 you currently
> > lose 4MB of memory just for preallocated per CPU data). But that is
> > something for post 2.6.14.
>
> Im interested in doing that on ppc64 too. Are you currently only
> creating per cpu data areas for possible cpus? 

Yes. In fact that caused some of the problems that lead to the investigation 
of this.

> The generic code does 
> NR_CPUS worth, we should change that in 2.6.15.

You need to audit all architecture code then that fills up possible map.
(at least for the architectures that support CPU hotplug) 

With that change x86-64 could move back to the generic code BTW - 
the numa allocation in the architecture specific code doesn't work anymore 
because we usually find out about the cpu<->node mapping now only 
afterwards :/

-Andi

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2005-10-08 10:31 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-10-05  7:16 Bogus load average and cpu times on x86_64 SMP kernels Brian Gerst
2005-10-05 18:00 ` Brian Gerst
2005-10-07  4:15   ` [PATCH] Fix hotplug cpu on x86_64 Brian Gerst
2005-10-07  9:50     ` Andi Kleen
2005-10-08  0:50       ` Anton Blanchard
2005-10-08 10:33         ` Andi Kleen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.