* calibrate_migration_costs takes ages on s390
@ 2006-02-13 10:26 Heiko Carstens
2006-02-13 10:34 ` David S. Miller
2006-02-13 10:46 ` Ingo Molnar
0 siblings, 2 replies; 13+ messages in thread
From: Heiko Carstens @ 2006-02-13 10:26 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Hannes Reinecke, linux-kernel
The boot sequence on s390 sometimes takes ages and we spend a very long time
(up to one or two minutes) in calibrate_migration_costs. The time spent there
differs from boot to boot. Also the calculated costs differ a lot. I've seen
differences by up to a factor of 15 (yes, factor not percent).
Also I doubt that making these measurements makes much sense on a completely
virtualized architecture where you cannot tell how much cpu time you will
get anyway.
Is there any workaround or fix available so we can avoid seeing this?
Thanks,
Heiko
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-13 10:26 calibrate_migration_costs takes ages on s390 Heiko Carstens
@ 2006-02-13 10:34 ` David S. Miller
2006-02-13 10:54 ` Ingo Molnar
2006-02-13 10:46 ` Ingo Molnar
1 sibling, 1 reply; 13+ messages in thread
From: David S. Miller @ 2006-02-13 10:34 UTC (permalink / raw)
To: heiko.carstens; +Cc: mingo, hare, linux-kernel
From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Mon, 13 Feb 2006 11:26:34 +0100
> The boot sequence on s390 sometimes takes ages and we spend a very long time
> (up to one or two minutes) in calibrate_migration_costs. The time spent there
> differs from boot to boot. Also the calculated costs differ a lot. I've seen
> differences by up to a factor of 15 (yes, factor not percent).
> Also I doubt that making these measurements makes much sense on a completely
> virtualized architecture where you cannot tell how much cpu time you will
> get anyway.
> Is there any workaround or fix available so we can avoid seeing this?
Things are not as slow, but definitely slow on sparc64 too, and it's
also due to the migration cost calculations.
It's also really bad that it's using vmalloc(), for one thing, because
this thrashes the TLB (some of us have 64-entry software replaced
TLBs) and also because you can make no guarantees about how well the
backing physical pages will distribute into the L2 cache.
As a result, wildly different run-to-run results can be expected
particularly for systems with 1-way or 2-way set associative L2
caches, which are common on sparc64. I don't know about s390.
I think the migration cost calculator is way overboard and needs to be
toned down a little bit.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-13 10:26 calibrate_migration_costs takes ages on s390 Heiko Carstens
2006-02-13 10:34 ` David S. Miller
@ 2006-02-13 10:46 ` Ingo Molnar
2006-02-13 16:13 ` Heiko Carstens
` (2 more replies)
1 sibling, 3 replies; 13+ messages in thread
From: Ingo Molnar @ 2006-02-13 10:46 UTC (permalink / raw)
To: Heiko Carstens; +Cc: Hannes Reinecke, linux-kernel, Andrew Morton
* Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> The boot sequence on s390 sometimes takes ages and we spend a very
> long time (up to one or two minutes) in calibrate_migration_costs. The
> time spent there differs from boot to boot. Also the calculated costs
> differ a lot. I've seen differences by up to a factor of 15 (yes,
> factor not percent). Also I doubt that making these measurements makes
> much sense on a completely virtualized architecture where you cannot
> tell how much cpu time you will get anyway. Is there any workaround or
> fix available so we can avoid seeing this?
which is the precise kernel version used? We toned down calibration a
bit recently.
The immediate workaround would be to use the migration_cost=0 boot
parameter.
Generally, i agree that it makes sense to not calibrate at all on
virtual platforms. Does the patch below help? It gives virtual
platforms a way to provide a default migration cost and thus avoid the
boot-time calibration altogether. (I have tested it on x86, it does the
expected thing.) This needs to hit v2.6.16 too.
Ingo
---------
introduce the CONFIG_DEFAULT_MIGRATION_COST method for an architecture
to set the scheduler migration costs. This turns off automatic detection
of migration costs. Makes sense on virtual platforms, where migration
costs are hard to measure accurately.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
----
arch/s390/Kconfig | 4 ++++
kernel/sched.c | 13 ++++++++++++-
2 files changed, 16 insertions(+), 1 deletion(-)
Index: linux-robust-list.q/arch/s390/Kconfig
===================================================================
--- linux-robust-list.q.orig/arch/s390/Kconfig
+++ linux-robust-list.q/arch/s390/Kconfig
@@ -80,6 +80,10 @@ config HOTPLUG_CPU
can be controlled through /sys/devices/system/cpu/cpu#.
Say N if you want to disable CPU hotplug.
+config DEFAULT_MIGRATION_COST
+ int
+ default "1000000"
+
config MATHEMU
bool "IEEE FPU emulation"
depends on MARCH_G5
Index: linux-robust-list.q/kernel/sched.c
===================================================================
--- linux-robust-list.q.orig/kernel/sched.c
+++ linux-robust-list.q/kernel/sched.c
@@ -5159,7 +5159,18 @@ static void init_sched_build_groups(stru
#define MAX_DOMAIN_DISTANCE 32
static unsigned long long migration_cost[MAX_DOMAIN_DISTANCE] =
- { [ 0 ... MAX_DOMAIN_DISTANCE-1 ] = -1LL };
+ { [ 0 ... MAX_DOMAIN_DISTANCE-1 ] =
+/*
+ * Architectures may override the migration cost and thus avoid
+ * boot-time calibration. Unit is nanoseconds. Mostly useful for
+ * virtualized hardware:
+ */
+#ifdef CONFIG_DEFAULT_MIGRATION_COST
+ CONFIG_DEFAULT_MIGRATION_COST
+#else
+ -1LL
+#endif
+};
/*
* Allow override of migration cost - in units of microseconds.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-13 10:34 ` David S. Miller
@ 2006-02-13 10:54 ` Ingo Molnar
2006-02-13 20:57 ` David S. Miller
0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2006-02-13 10:54 UTC (permalink / raw)
To: David S. Miller; +Cc: heiko.carstens, hare, linux-kernel
* David S. Miller <davem@davemloft.net> wrote:
> Things are not as slow, but definitely slow on sparc64 too, and it's
> also due to the migration cost calculations.
>
> It's also really bad that it's using vmalloc(), for one thing, because
> this thrashes the TLB (some of us have 64-entry software replaced
> TLBs) and also because you can make no guarantees about how well the
> backing physical pages will distribute into the L2 cache.
the TLB thrashing is intended, to calculate the worst-case migration
cost. If userspace is TLB-intensive, it will thrash TLBs just as much.
> As a result, wildly different run-to-run results can be expected
> particularly for systems with 1-way or 2-way set associative L2
> caches, which are common on sparc64. I don't know about s390.
s390 is clearly a special case, being a virtual platform. But the
calibration should be improved to work better on sparc64.
Do things get better if you fill out include/asm-sparc64/system.h's
sched_cacheflush() function, to flush the L2 cache? That should at least
make the cache state more or less reproducible across runs.
Ingo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-13 10:46 ` Ingo Molnar
@ 2006-02-13 16:13 ` Heiko Carstens
2006-02-13 23:42 ` Olaf Hering
2006-02-16 6:27 ` Heiko Carstens
2 siblings, 0 replies; 13+ messages in thread
From: Heiko Carstens @ 2006-02-13 16:13 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Hannes Reinecke, linux-kernel, Andrew Morton
> > The boot sequence on s390 sometimes takes ages and we spend a very
> > long time (up to one or two minutes) in calibrate_migration_costs. The
> > time spent there differs from boot to boot. Also the calculated costs
> > differ a lot. I've seen differences by up to a factor of 15 (yes,
> > factor not percent). Also I doubt that making these measurements makes
> > much sense on a completely virtualized architecture where you cannot
> > tell how much cpu time you will get anyway. Is there any workaround or
> > fix available so we can avoid seeing this?
>
> which is the precise kernel version used? We toned down calibration a
> bit recently.
2.6.16-rc3.
> The immediate workaround would be to use the migration_cost=0 boot
> parameter.
>
> Generally, i agree that it makes sense to not calibrate at all on
> virtual platforms. Does the patch below help? It gives virtual
> platforms a way to provide a default migration cost and thus avoid the
> boot-time calibration altogether. (I have tested it on x86, it does the
> expected thing.) This needs to hit v2.6.16 too.
Yes, calibrate_migration_costs is very fast now. But it turned out that
this was just hiding the real problem: if we have CONFIG_PREEMPT disabled
the kernel gets (sometimes) unbelievably slow.
I think this happened somewhere between rc1 and rc3. Maybe Hannes knows
more exactly when this happened the first time, since I always run with
CONFIG_PREEMPT enabled.
Heiko
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-13 10:54 ` Ingo Molnar
@ 2006-02-13 20:57 ` David S. Miller
0 siblings, 0 replies; 13+ messages in thread
From: David S. Miller @ 2006-02-13 20:57 UTC (permalink / raw)
To: mingo; +Cc: heiko.carstens, hare, linux-kernel
From: Ingo Molnar <mingo@elte.hu>
Date: Mon, 13 Feb 2006 11:54:21 +0100
> Do things get better if you fill out include/asm-sparc64/system.h's
> sched_cacheflush() function, to flush the L2 cache? That should at least
> make the cache state more or less reproducible across runs.
Yes, I tried to implement that, and it makes the migration cost
calculation take 4 to 5 times longer, so I'm leaving it unimplemented.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-13 10:46 ` Ingo Molnar
2006-02-13 16:13 ` Heiko Carstens
@ 2006-02-13 23:42 ` Olaf Hering
2006-02-14 0:08 ` Olaf Hering
2006-02-16 6:27 ` Heiko Carstens
2 siblings, 1 reply; 13+ messages in thread
From: Olaf Hering @ 2006-02-13 23:42 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Heiko Carstens, Hannes Reinecke, linux-kernel, Andrew Morton
On Mon, Feb 13, Ingo Molnar wrote:
>
> * Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
>
> > The boot sequence on s390 sometimes takes ages and we spend a very
> > long time (up to one or two minutes) in calibrate_migration_costs. The
> > time spent there differs from boot to boot. Also the calculated costs
> > differ a lot. I've seen differences by up to a factor of 15 (yes,
> > factor not percent). Also I doubt that making these measurements makes
> > much sense on a completely virtualized architecture where you cannot
> > tell how much cpu time you will get anyway. Is there any workaround or
> > fix available so we can avoid seeing this?
>
> which is the precise kernel version used? We toned down calibration a
> bit recently.
We did a bit of testing, -rc2-git3 + the patch below was still ok.
[PATCH] s390: earlier initialization of cpu_possible_map
9733e2407ad2237867cb13c04e7d619397fa3090
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-13 23:42 ` Olaf Hering
@ 2006-02-14 0:08 ` Olaf Hering
2006-02-14 8:09 ` Heiko Carstens
0 siblings, 1 reply; 13+ messages in thread
From: Olaf Hering @ 2006-02-14 0:08 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Heiko Carstens, Hannes Reinecke, linux-kernel, Andrew Morton
On Tue, Feb 14, Olaf Hering wrote:
> We did a bit of testing, -rc2-git3 + the patch below was still ok.
>
> [PATCH] s390: earlier initialization of cpu_possible_map
> 9733e2407ad2237867cb13c04e7d619397fa3090
I need to double check, but -git5 + that patch was reported to be slow.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-14 0:08 ` Olaf Hering
@ 2006-02-14 8:09 ` Heiko Carstens
2006-02-14 10:56 ` Heiko Carstens
0 siblings, 1 reply; 13+ messages in thread
From: Heiko Carstens @ 2006-02-14 8:09 UTC (permalink / raw)
To: Olaf Hering; +Cc: Ingo Molnar, Hannes Reinecke, linux-kernel, Andrew Morton
> > We did a bit of testing, -rc2-git3 + the patch below was still ok.
> >
> > [PATCH] s390: earlier initialization of cpu_possible_map
> > 9733e2407ad2237867cb13c04e7d619397fa3090
>
> I need to double check, but -git5 + that patch was reported to be slow.
I did a quick git bisect search. This is the one that hurts:
Author: Ingo Molnar <mingo@elte.hu> 2006-02-07 21:58:54
Committer: Linus Torvalds <torvalds@g5.osdl.org> 2006-02-08 01:12:33
Parent: 8519fb30e438f8088b71a94a7d5a660a814d3872 ([PATCH] mm: compound release fix)
Child: 0d4c3e7a8c65892c7d6a748fdbb4499e988880db ([PATCH] unshare system call -v5: Documentation file)
[PATCH] Fix spinlock debugging delays to not time out too early
The spinlock-debug wait-loop was using loops_per_jiffy to detect too long
spinlock waits - but on fast CPUs this led to a way too fast timeout and false
messages.
The fix is to include a __delay(1) call in the loop, to correctly approximate
the intended delay timeout of 1 second. The code assumes that every
architecture implements __delay(1) to last around 1/(loops_per_jiffy*HZ)
seconds.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I guess we're once again suffering from being a virtualized platform: the
formerly used call to cpu_relax() informed the underlying hypervisor that
we want to give up the current cpu while __delay() keeps it.
Unless we're scheduled away involuntarily.
The "Detect Soft Lockups" option doesn't make much sense on our
platform either, since we get a lot of false positives.
Quick fix: turn off the options CONFIG_DEBUG_SPINLOCK and
CONFIG_DETECT_SOFTLOCKUP.
Heiko
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-14 8:09 ` Heiko Carstens
@ 2006-02-14 10:56 ` Heiko Carstens
2006-02-14 12:35 ` Ingo Molnar
0 siblings, 1 reply; 13+ messages in thread
From: Heiko Carstens @ 2006-02-14 10:56 UTC (permalink / raw)
To: Olaf Hering; +Cc: Ingo Molnar, Hannes Reinecke, linux-kernel, Andrew Morton
> I did a quick git bisect search. This is the one that hurts:
>
> Author: Ingo Molnar <mingo@elte.hu> 2006-02-07 21:58:54
> Committer: Linus Torvalds <torvalds@g5.osdl.org> 2006-02-08 01:12:33
> Parent: 8519fb30e438f8088b71a94a7d5a660a814d3872 ([PATCH] mm: compound release fix)
> Child: 0d4c3e7a8c65892c7d6a748fdbb4499e988880db ([PATCH] unshare system call -v5: Documentation file)
>
> The fix is to include a __delay(1) call in the loop, to correctly approximate
> the intended delay timeout of 1 second. The code assumes that every
> architecture implements __delay(1) to last around 1/(loops_per_jiffy*HZ)
> seconds.
>
> I guess we're once again suffering from being a virtualized platform: the
> formerly used call to cpu_relax() informed the underlying hypervisor that
> we want to give up the current cpu while __delay() keeps it.
> Unless we're scheduled away involuntarily.
> The "Detect Soft Lockups" option doesn't make much sense on our
> platform either, since we get a lot of false positives.
> Quick fix: turn off the options CONFIG_DEBUG_SPINLOCK and
> CONFIG_DETECT_SOFTLOCKUP.
Wrong analysis. Our __delay() implementation is broken. This doesn't help for
the CONFIG_DETECT_SOFTLOCKUP case, but at least CONFIG_DEBUG_SPINLOCK works
again with this.
Andrew, could you pick this one up, or should I send it separately?
[PATCH] s390: fix __delay implementation
From: Heiko Carstens <heiko.carstens@de.ibm.com>
Fix __delay implementation. Called with an argument "1" or "0" it
would loop nearly forever (since (1/2)-1 = 0xffffffff).
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
arch/s390/lib/delay.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/lib/delay.c b/arch/s390/lib/delay.c
index e96c35b..71f0a2f 100644
--- a/arch/s390/lib/delay.c
+++ b/arch/s390/lib/delay.c
@@ -30,7 +30,7 @@ void __delay(unsigned long loops)
*/
__asm__ __volatile__(
"0: brct %0,0b"
- : /* no outputs */ : "r" (loops/2) );
+ : /* no outputs */ : "r" ((loops/2) + 1));
}
/*
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-14 10:56 ` Heiko Carstens
@ 2006-02-14 12:35 ` Ingo Molnar
2006-02-14 12:37 ` Ingo Molnar
0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2006-02-14 12:35 UTC (permalink / raw)
To: Heiko Carstens; +Cc: Olaf Hering, Hannes Reinecke, linux-kernel, Andrew Morton
* Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> --- a/arch/s390/lib/delay.c
> +++ b/arch/s390/lib/delay.c
> @@ -30,7 +30,7 @@ void __delay(unsigned long loops)
> */
> __asm__ __volatile__(
> "0: brct %0,0b"
> - : /* no outputs */ : "r" (loops/2) );
> + : /* no outputs */ : "r" ((loops/2) + 1));
> }
ahh ... that explains the delays indeed!
Ingo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-14 12:35 ` Ingo Molnar
@ 2006-02-14 12:37 ` Ingo Molnar
0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2006-02-14 12:37 UTC (permalink / raw)
To: Heiko Carstens; +Cc: Olaf Hering, Hannes Reinecke, linux-kernel, Andrew Morton
* Ingo Molnar <mingo@elte.hu> wrote:
> * Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
>
> > --- a/arch/s390/lib/delay.c
> > +++ b/arch/s390/lib/delay.c
> > @@ -30,7 +30,7 @@ void __delay(unsigned long loops)
> > */
> > __asm__ __volatile__(
> > "0: brct %0,0b"
> > - : /* no outputs */ : "r" (loops/2) );
> > + : /* no outputs */ : "r" ((loops/2) + 1));
> > }
>
> ahh ... that explains the delays indeed!
just to make sure, i've checked all the other __delay() implementations
in the kernel, and none seems to have such problems.
Ingo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: calibrate_migration_costs takes ages on s390
2006-02-13 10:46 ` Ingo Molnar
2006-02-13 16:13 ` Heiko Carstens
2006-02-13 23:42 ` Olaf Hering
@ 2006-02-16 6:27 ` Heiko Carstens
2 siblings, 0 replies; 13+ messages in thread
From: Heiko Carstens @ 2006-02-16 6:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: Hannes Reinecke, linux-kernel, Ingo Molnar
> introduce the CONFIG_DEFAULT_MIGRATION_COST method for an architecture
> to set the scheduler migration costs. This turns off automatic detection
> of migration costs. Makes sense on virtual platforms, where migration
> costs are hard to measure accurately.
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>
> ----
>
> arch/s390/Kconfig | 4 ++++
> kernel/sched.c | 13 ++++++++++++-
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> Index: linux-robust-list.q/arch/s390/Kconfig
> ===================================================================
> --- linux-robust-list.q.orig/arch/s390/Kconfig
> +++ linux-robust-list.q/arch/s390/Kconfig
> @@ -80,6 +80,10 @@ config HOTPLUG_CPU
> can be controlled through /sys/devices/system/cpu/cpu#.
> Say N if you want to disable CPU hotplug.
>
> +config DEFAULT_MIGRATION_COST
> + int
> + default "1000000"
> +
> config MATHEMU
> bool "IEEE FPU emulation"
> depends on MARCH_G5
> Index: linux-robust-list.q/kernel/sched.c
> ===================================================================
> --- linux-robust-list.q.orig/kernel/sched.c
> +++ linux-robust-list.q/kernel/sched.c
> @@ -5159,7 +5159,18 @@ static void init_sched_build_groups(stru
> #define MAX_DOMAIN_DISTANCE 32
>
> static unsigned long long migration_cost[MAX_DOMAIN_DISTANCE] =
> - { [ 0 ... MAX_DOMAIN_DISTANCE-1 ] = -1LL };
> + { [ 0 ... MAX_DOMAIN_DISTANCE-1 ] =
> +/*
> + * Architectures may override the migration cost and thus avoid
> + * boot-time calibration. Unit is nanoseconds. Mostly useful for
> + * virtualized hardware:
> + */
> +#ifdef CONFIG_DEFAULT_MIGRATION_COST
> + CONFIG_DEFAULT_MIGRATION_COST
> +#else
> + -1LL
> +#endif
> +};
>
> /*
> * Allow override of migration cost - in units of microseconds.
> -
This one should be applied then.
Thanks,
Heiko
^ permalink raw reply [flat|nested] 13+ messages in thread