* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
@ 2011-01-20 9:42 Santosh Shilimkar
2011-01-20 15:14 ` Rob Herring
0 siblings, 1 reply; 13+ messages in thread
From: Santosh Shilimkar @ 2011-01-20 9:42 UTC (permalink / raw)
To: linux-arm-kernel
On some architectures, secondary cores shares clock with primiary
core and hence scale together. Hence secondary core lpj calibration
is not necessary and can be skipped to save considerable time.
This can speed up the secondary cpu boot and hotplug cpu online
paths.
Discussion thread:
http://www.spinics.net/lists/arm-kernel/msg111124.html
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Linus Walleij <linus.walleij@stericsson.com>
---
arch/arm/Kconfig | 5 +++++
arch/arm/kernel/smp.c | 36 ++++++++++++++++++++++++++++--------
arch/arm/mach-omap2/Kconfig | 1 +
3 files changed, 34 insertions(+), 8 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 5cff165..8944639 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -823,6 +823,7 @@ config ARCH_U300
config ARCH_U8500
bool "ST-Ericsson U8500 Series"
select CPU_V7
+ select ARCH_HAS_COMMON_CORES_CLOCK
select ARM_AMBA
select GENERIC_CLOCKEVENTS
select CLKDEV_LOOKUP
@@ -1418,6 +1419,10 @@ config ARCH_SPARSEMEM_DEFAULT
config ARCH_SELECT_MEMORY_MODEL
def_bool ARCH_SPARSEMEM_ENABLE
+config ARCH_HAS_COMMON_CORES_CLOCK
+ bool
+ depends on SMP
+
config HIGHMEM
bool "High Memory Support (EXPERIMENTAL)"
depends on MMU && EXPERIMENTAL
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 4539ebc..8e72b11 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -270,6 +270,20 @@ static void __cpuinit smp_store_cpu_info(unsigned int cpuid)
}
/*
+ * Skip the secondary calibration on architectures sharing clock
+ * with primary cpu. Archs can select ARCH_HAS_COMMON_CORES_CLOCK
+ * for this.
+ */
+static inline int skip_secondary_calibrate(void)
+{
+#ifdef CONFIG_ARCH_HAS_COMMON_CORES_CLOCK
+ return true;
+#else
+ return false;
+#endif
+}
+
+/*
* This is the secondary CPU boot entry. We're using this CPUs
* idle thread stack, but a set of temporary page tables.
*/
@@ -312,7 +326,8 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
*/
percpu_timer_setup();
- calibrate_delay();
+ if (!skip_secondary_calibrate())
+ calibrate_delay();
smp_store_cpu_info(cpu);
@@ -332,14 +347,19 @@ void __init smp_cpus_done(unsigned int max_cpus)
int cpu;
unsigned long bogosum = 0;
- for_each_online_cpu(cpu)
- bogosum += per_cpu(cpu_data, cpu).loops_per_jiffy;
+ if (!skip_secondary_calibrate()) {
+ for_each_online_cpu(cpu)
+ bogosum += per_cpu(cpu_data, cpu).loops_per_jiffy;
- printk(KERN_INFO "SMP: Total of %d processors activated "
- "(%lu.%02lu BogoMIPS).\n",
- num_online_cpus(),
- bogosum / (500000/HZ),
- (bogosum / (5000/HZ)) % 100);
+ printk(KERN_INFO "SMP: Total of %d processors activated "
+ "(%lu.%02lu BogoMIPS).\n",
+ num_online_cpus(),
+ bogosum / (500000/HZ),
+ (bogosum / (5000/HZ)) % 100);
+ } else {
+ printk(KERN_INFO "SMP: Total of %d processors activated.\n",
+ num_online_cpus());
+ }
}
void __init smp_prepare_boot_cpu(void)
diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index 1a2cf62..e6ad15d 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -43,6 +43,7 @@ config ARCH_OMAP4
default y
depends on ARCH_OMAP2PLUS
select CPU_V7
+ select ARCH_HAS_COMMON_CORES_CLOCK
select ARM_GIC
select PL310_ERRATA_588369
select ARM_ERRATA_720789
--
1.6.0.4
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-20 9:42 [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot Santosh Shilimkar
@ 2011-01-20 15:14 ` Rob Herring
2011-01-20 15:36 ` Santosh Shilimkar
2011-01-20 16:21 ` Linus Walleij
0 siblings, 2 replies; 13+ messages in thread
From: Rob Herring @ 2011-01-20 15:14 UTC (permalink / raw)
To: linux-arm-kernel
On 01/20/2011 03:42 AM, Santosh Shilimkar wrote:
> On some architectures, secondary cores shares clock with primiary
> core and hence scale together. Hence secondary core lpj calibration
> is not necessary and can be skipped to save considerable time.
>
> This can speed up the secondary cpu boot and hotplug cpu online
> paths.
>
> Discussion thread:
> http://www.spinics.net/lists/arm-kernel/msg111124.html
>
There's already one way to do this with pre-calculated lpj.
Also, this isn't multi-platform friendly. You could accomplish the same
thing using the clock api to get the core frequency of each core and
only calculate lpj if the frequency is different.
Rob
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-20 15:14 ` Rob Herring
@ 2011-01-20 15:36 ` Santosh Shilimkar
2011-01-20 16:34 ` Rob Herring
2011-01-20 16:21 ` Linus Walleij
1 sibling, 1 reply; 13+ messages in thread
From: Santosh Shilimkar @ 2011-01-20 15:36 UTC (permalink / raw)
To: linux-arm-kernel
> -----Original Message-----
> From: Rob Herring [mailto:robherring2 at gmail.com]
> Sent: Thursday, January 20, 2011 8:44 PM
> To: Santosh Shilimkar
> Cc: linux-arm-kernel at lists.infradead.org; Russell King; linux-
> omap at vger.kernel.org; Linus Walleij
> Subject: Re: [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK
> to speed-up boot
>
> On 01/20/2011 03:42 AM, Santosh Shilimkar wrote:
> > On some architectures, secondary cores shares clock with primiary
> > core and hence scale together. Hence secondary core lpj
> calibration
> > is not necessary and can be skipped to save considerable time.
> >
> > This can speed up the secondary cpu boot and hotplug cpu online
> > paths.
> >
> > Discussion thread:
> > http://www.spinics.net/lists/arm-kernel/msg111124.html
> >
>
> There's already one way to do this with pre-calculated lpj.
>
How about the hot-plug path? This is not for just boot.
> Also, this isn't multi-platform friendly. You could accomplish the
> same
> thing using the clock api to get the core frequency of each core and
> only calculate lpj if the frequency is different.
May be but what's wrong with the obvious approach which is
completely non-intrusive.
Why is not multi-platform friendly ?
Archs can choose not to select this option.
Regards,
Santosh
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-20 15:36 ` Santosh Shilimkar
@ 2011-01-20 16:34 ` Rob Herring
2011-01-21 13:43 ` Santosh Shilimkar
0 siblings, 1 reply; 13+ messages in thread
From: Rob Herring @ 2011-01-20 16:34 UTC (permalink / raw)
To: linux-arm-kernel
On 01/20/2011 09:36 AM, Santosh Shilimkar wrote:
>> -----Original Message-----
>> From: Rob Herring [mailto:robherring2 at gmail.com]
>> Sent: Thursday, January 20, 2011 8:44 PM
>> To: Santosh Shilimkar
>> Cc: linux-arm-kernel at lists.infradead.org; Russell King; linux-
>> omap at vger.kernel.org; Linus Walleij
>> Subject: Re: [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK
>> to speed-up boot
>>
>> On 01/20/2011 03:42 AM, Santosh Shilimkar wrote:
>>> On some architectures, secondary cores shares clock with primiary
>>> core and hence scale together. Hence secondary core lpj
>> calibration
>>> is not necessary and can be skipped to save considerable time.
>>>
>>> This can speed up the secondary cpu boot and hotplug cpu online
>>> paths.
>>>
>>> Discussion thread:
>>> http://www.spinics.net/lists/arm-kernel/msg111124.html
>>>
>>
>> There's already one way to do this with pre-calculated lpj.
>>
> How about the hot-plug path? This is not for just boot.
The path is the same for hotplug and secondary boot, so yes for both.
Plus you get the added benefit of speeding up the primary core boot as well.
>
>> Also, this isn't multi-platform friendly. You could accomplish the
>> same
>> thing using the clock api to get the core frequency of each core and
>> only calculate lpj if the frequency is different.
> May be but what's wrong with the obvious approach which is
> completely non-intrusive.
> Why is not multi-platform friendly ?
> Archs can choose not to select this option.
I meant you can't have single kernel binary with a platform with single
core freq and a platform with independent core freq.
Looking at this some more, the only reason to call calibrate_delay is to
get a more accurate value. If you have different frequencies per core,
you've got bigger problems as loops_per_jiffy value is not per core. So
your kconfig option name is misleading. Something like
ARCH_WANT_UDELAY_RECALC would be more accurate. To get a more accurate
calculation, calibrate_delay only needs to be called by 1 secondary
core. Perhaps it could be called from somewhere that is not in the
boot/hotplug path or only done once.
Rob
>
> Regards,
> Santosh
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-20 16:34 ` Rob Herring
@ 2011-01-21 13:43 ` Santosh Shilimkar
2011-01-21 14:23 ` Linus Walleij
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Santosh Shilimkar @ 2011-01-21 13:43 UTC (permalink / raw)
To: linux-arm-kernel
> -----Original Message-----
> From: Rob Herring [mailto:robherring2 at gmail.com]
> Sent: Thursday, January 20, 2011 10:05 PM
> To: Santosh Shilimkar
> Cc: linux-arm-kernel at lists.infradead.org; Russell King; linux-
> omap at vger.kernel.org; Linus Walleij
> Subject: Re: [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK
> to speed-up boot
[..]
> >>
> >> There's already one way to do this with pre-calculated lpj.
> >>
> > How about the hot-plug path? This is not for just boot.
>
> The path is the same for hotplug and secondary boot, so yes for
> both.
> Plus you get the added benefit of speeding up the primary core boot
> as well.
>
No 'preset_lpj' will not work for the hotplug path when
cpufreq is active. It just useful only for boot in
its current form.
> >
> >> Also, this isn't multi-platform friendly. You could accomplish
> the
> >> same
> >> thing using the clock api to get the core frequency of each core
> and
> >> only calculate lpj if the frequency is different.
> > May be but what's wrong with the obvious approach which is
> > completely non-intrusive.
> > Why is not multi-platform friendly ?
> > Archs can choose not to select this option.
>
> I meant you can't have single kernel binary with a platform with
> single
> core freq and a platform with independent core freq.
>
> Looking at this some more, the only reason to call calibrate_delay
> is to
> get a more accurate value. If you have different frequencies per
> core,
> you've got bigger problems as loops_per_jiffy value is not per core.
> So
> your kconfig option name is misleading. Something like
> ARCH_WANT_UDELAY_RECALC would be more accurate. To get a more
> accurate
> calculation, calibrate_delay only needs to be called by 1 secondary
> core. Perhaps it could be called from somewhere that is not in the
> boot/hotplug path or only done once.
>
Similar name was there in my earlier version.
"ARCH_SKIP_SECONDARY_CALIBRATE"
I changed it based on Linus W suggestion. I understand your one
binary point and this patch.
How about below approach?
-----------------------------------------------------
[PATCH] ARM: smp: Skip secondary cpu calibration to speed-up boot
On some architectures, secondary cores shares clock with primiary
core and hence scale together. Hence secondary core lpj calibration
is not necessary and can be skipped to save considerable time.
This can speed up the secondary cpu boot and hotplug cpu online
paths.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
arch/arm/include/asm/smp.h | 8 ++++++++
arch/arm/kernel/smp.c | 34 ++++++++++++++++++++++++++--------
arch/arm/mach-omap2/omap-smp.c | 3 +++
3 files changed, 37 insertions(+), 8 deletions(-)
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 96ed521..7ffdfec 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -69,6 +69,14 @@ asmlinkage void secondary_start_kernel(void);
extern void platform_secondary_init(unsigned int cpu);
/*
+ * Skip the secondary calibration on architectures sharing clock
+ * with primary cpu. Needs to be called for archs inside
+ * platform_secondary_init()
+ */
+extern void secondary_skip_calibrate(void);
+
+
+/*
* Initialize cpu_possible map, and enable coherency
*/
extern void platform_smp_prepare_cpus(unsigned int);
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 4539ebc..b20c408 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -55,6 +55,8 @@ enum ipi_msg_type {
IPI_CPU_STOP,
};
+static unsigned int skip_secondary_calibrate;
+
int __cpuinit __cpu_up(unsigned int cpu)
{
struct cpuinfo_arm *ci = &per_cpu(cpu_data, cpu);
@@ -270,6 +272,16 @@ static void __cpuinit smp_store_cpu_info(unsigned int
cpuid)
}
/*
+ * Skip the secondary calibration on architectures sharing clock
+ * with primary cpu. Needs to be called for archs from
+ * platform_secondary_init()
+ */
+void secondary_skip_calibrate(void)
+{
+ skip_secondary_calibrate = 1;
+}
+
+/*
* This is the secondary CPU boot entry. We're using this CPUs
* idle thread stack, but a set of temporary page tables.
*/
@@ -312,7 +324,8 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
*/
percpu_timer_setup();
- calibrate_delay();
+ if (!skip_secondary_calibrate)
+ calibrate_delay();
smp_store_cpu_info(cpu);
@@ -332,14 +345,19 @@ void __init smp_cpus_done(unsigned int max_cpus)
int cpu;
unsigned long bogosum = 0;
- for_each_online_cpu(cpu)
- bogosum += per_cpu(cpu_data, cpu).loops_per_jiffy;
+ if (!skip_secondary_calibrate) {
+ for_each_online_cpu(cpu)
+ bogosum += per_cpu(cpu_data, cpu).loops_per_jiffy;
- printk(KERN_INFO "SMP: Total of %d processors activated "
- "(%lu.%02lu BogoMIPS).\n",
- num_online_cpus(),
- bogosum / (500000/HZ),
- (bogosum / (5000/HZ)) % 100);
+ printk(KERN_INFO "SMP: Total of %d processors activated "
+ "(%lu.%02lu BogoMIPS).\n",
+ num_online_cpus(),
+ bogosum / (500000/HZ),
+ (bogosum / (5000/HZ)) % 100);
+ } else {
+ printk(KERN_INFO "SMP: Total of %d processors
activated.\n",
+ num_online_cpus());
+ }
}
void __init smp_prepare_boot_cpu(void)
diff --git a/arch/arm/mach-omap2/omap-smp.c
b/arch/arm/mach-omap2/omap-smp.c
index b66cfe8..7342cd5 100644
--- a/arch/arm/mach-omap2/omap-smp.c
+++ b/arch/arm/mach-omap2/omap-smp.c
@@ -39,6 +39,9 @@ void __cpuinit platform_secondary_init(unsigned int cpu)
*/
gic_secondary_init(0);
+ /* Allow to skip secondary CPU calibration */
+ secondary_skip_calibrate();
+
/*
* Synchronise with the boot thread.
*/
--
1.6.0.4
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ARM-smp-Skip-secondary-cpu-calibration-to-speed-up.patch
Type: application/octet-stream
Size: 3698 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20110121/8c8d8492/attachment.obj>
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-21 13:43 ` Santosh Shilimkar
@ 2011-01-21 14:23 ` Linus Walleij
2011-01-21 15:00 ` Rob Herring
2011-01-21 17:08 ` [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot Russell King - ARM Linux
2 siblings, 0 replies; 13+ messages in thread
From: Linus Walleij @ 2011-01-21 14:23 UTC (permalink / raw)
To: linux-arm-kernel
2011/1/21 Santosh Shilimkar <santosh.shilimkar@ti.com>:
> How about below approach?
>
> -----------------------------------------------------
> [PATCH] ARM: smp: Skip secondary cpu calibration to speed-up boot
Hey this looks like a good way to do it,
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-21 13:43 ` Santosh Shilimkar
2011-01-21 14:23 ` Linus Walleij
@ 2011-01-21 15:00 ` Rob Herring
2011-01-21 17:15 ` Russell King - ARM Linux
2011-01-21 17:08 ` [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot Russell King - ARM Linux
2 siblings, 1 reply; 13+ messages in thread
From: Rob Herring @ 2011-01-21 15:00 UTC (permalink / raw)
To: linux-arm-kernel
Santosh,
On 01/21/2011 07:43 AM, Santosh Shilimkar wrote:
>> -----Original Message-----
>> From: Rob Herring [mailto:robherring2 at gmail.com]
>> Sent: Thursday, January 20, 2011 10:05 PM
>> To: Santosh Shilimkar
>> Cc: linux-arm-kernel at lists.infradead.org; Russell King; linux-
>> omap at vger.kernel.org; Linus Walleij
>> Subject: Re: [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK
>> to speed-up boot
>
> [..]
>
>>>>
>>>> There's already one way to do this with pre-calculated lpj.
>>>>
>>> How about the hot-plug path? This is not for just boot.
>>
>> The path is the same for hotplug and secondary boot, so yes for
>> both.
>> Plus you get the added benefit of speeding up the primary core boot
>> as well.
>>
> No 'preset_lpj' will not work for the hotplug path when
> cpufreq is active. It just useful only for boot in
> its current form.
Why? If preset_lpj is non-zero, calibrate_delay will effectively be
skipped which is the same thing your patch does.
This is the cpfreq loops_per_jiffy adjustment function for SMP:
static inline void adjust_jiffies(unsigned long val, struct
cpufreq_freqs *ci)
{
return;
}
And delay.S uses the global loops_per_jiffy, not the per cpu value. The
only place I see the per cpu value get used is /proc/cpuinfo.
Consider the following sequence:
- scale down the cpu freq
- hot unplug a core
- hot plug a core
- calls calibrate_delay and update the global loops_per_jiffy
- scale up the cpu freq
- udelay time is now much too short!!!
So for that reason, I would just remove calibrate_delay unconditionally.
Better to have the 1% inaccuracy and longer delays at low frequency than
to have too short of a delay at high freq.
Rob
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-21 15:00 ` Rob Herring
@ 2011-01-21 17:15 ` Russell King - ARM Linux
2011-01-22 7:44 ` [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK tospeed-up boot Santosh Shilimkar
0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2011-01-21 17:15 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Jan 21, 2011 at 09:00:04AM -0600, Rob Herring wrote:
> Why? If preset_lpj is non-zero, calibrate_delay will effectively be
> skipped which is the same thing your patch does.
Theoretically...
You boot. lpj= sets preset_lpj. calibrate_delay() sets loops_per_jiffy
based on preset_lpj which happens to match the boot CPU speed correctly.
cpufreq initializes, changes the CPU frequency. loops_per_jiffy is
scaled for the new CPU clock rate.
You hot-unplug a CPU, and then hot-plug it back in. calibrate_delay()
gets called again, and it again notices preset_lpj is set. It copies
this value into loops_per_jiffy, overwriting it with what is now a
value which is completely wrong for the current CPU run rate.
> This is the cpfreq loops_per_jiffy adjustment function for SMP:
>
> static inline void adjust_jiffies(unsigned long val, struct
> cpufreq_freqs *ci)
> {
> return;
> }
That means cpufreq scaling on SMP is broken then... Why isn't cpufreq
marked with a !SMP dependence or something similar (eg, depends on
!SMP || CPU_INDEPENDENT_UDELAY)... Maybe it should be.
> And delay.S uses the global loops_per_jiffy, not the per cpu value. The
> only place I see the per cpu value get used is /proc/cpuinfo.
>
> Consider the following sequence:
>
> - scale down the cpu freq
> - hot unplug a core
> - hot plug a core
> - calls calibrate_delay and update the global loops_per_jiffy
> - scale up the cpu freq
> - udelay time is now much too short!!!
>
> So for that reason, I would just remove calibrate_delay unconditionally.
> Better to have the 1% inaccuracy and longer delays at low frequency than
> to have too short of a delay at high freq.
As you've shown above, it makes not a blind bit of difference whether
calibrate_delay() is called... on SMP the loops_per_jiffy will be wrong
as soon as you start scaling no matter what.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK tospeed-up boot
2011-01-21 17:15 ` Russell King - ARM Linux
@ 2011-01-22 7:44 ` Santosh Shilimkar
2011-01-22 21:20 ` Russell King - ARM Linux
0 siblings, 1 reply; 13+ messages in thread
From: Santosh Shilimkar @ 2011-01-22 7:44 UTC (permalink / raw)
To: linux-arm-kernel
> -----Original Message-----
> From: linux-arm-kernel-bounces at lists.infradead.org [mailto:linux-
> arm-kernel-bounces at lists.infradead.org] On Behalf Of Russell King -
> ARM Linux
> Sent: Friday, January 21, 2011 10:46 PM
> To: Rob Herring
> Cc: linux-omap at vger.kernel.org; Santosh Shilimkar; Linus Walleij;
> linux-arm-kernel at lists.infradead.org
> Subject: Re: [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK
> tospeed-up boot
>
[...]
>
> That means cpufreq scaling on SMP is broken then... Why isn't
> cpufreq
> marked with a !SMP dependence or something similar (eg, depends on
> !SMP || CPU_INDEPENDENT_UDELAY)... Maybe it should be.
>
Actually it's not broken but documented to use arch specific
per-cpu lpj with cpufreq scaling. And that update cab be
managed by arch specific cpufreq code based on the way
CPUs scale. That's how other arch including x86 doing it.
And global lpj can be update as well there which is not
done at this moment.
> > And delay.S uses the global loops_per_jiffy, not the per cpu
> value. The
> > only place I see the per cpu value get used is /proc/cpuinfo.
> >
> > Consider the following sequence:
> >
> > - scale down the cpu freq
> > - hot unplug a core
> > - hot plug a core
> > - calls calibrate_delay and update the global loops_per_jiffy
> > - scale up the cpu freq
> > - udelay time is now much too short!!!
> >
> > So for that reason, I would just remove calibrate_delay
> unconditionally.
> > Better to have the 1% inaccuracy and longer delays at low
> frequency than
> > to have too short of a delay at high freq.
>
> As you've shown above, it makes not a blind bit of difference
> whether
> calibrate_delay() is called... on SMP the loops_per_jiffy will be
> wrong
> as soon as you start scaling no matter what.
>
Surely whichever way we should remove the recalibration otherwise
every time secondary cpus spend ~ 200 ms there.
Russell,
How do you like to address this?
Regards,
Santosh
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK tospeed-up boot
2011-01-22 7:44 ` [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK tospeed-up boot Santosh Shilimkar
@ 2011-01-22 21:20 ` Russell King - ARM Linux
2011-01-23 7:25 ` [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCKtospeed-up boot Santosh Shilimkar
0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2011-01-22 21:20 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Jan 22, 2011 at 01:14:21PM +0530, Santosh Shilimkar wrote:
> Surely whichever way we should remove the recalibration otherwise
> every time secondary cpus spend ~ 200 ms there.
>
> Russell,
> How do you like to address this?
Well, the last piece of the puzzle which needs consideration is that
omitting the calibration means that it produces user-visible
/proc/cpuinfo changes. I can't say what effect that has. So, I
think we need to do something about that to ensure that userspace
won't be impacted should they be using the information in there.
Maybe we need to initialize the secondary CPU bogomips values
accordingly? Maybe zero is acceptable to mark that the calculation
hasn't been done? Dunno.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCKtospeed-up boot
2011-01-22 21:20 ` Russell King - ARM Linux
@ 2011-01-23 7:25 ` Santosh Shilimkar
0 siblings, 0 replies; 13+ messages in thread
From: Santosh Shilimkar @ 2011-01-23 7:25 UTC (permalink / raw)
To: linux-arm-kernel
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux at arm.linux.org.uk]
> Sent: Sunday, January 23, 2011 2:50 AM
> To: Santosh Shilimkar
> Cc: Rob Herring; linux-omap at vger.kernel.org; Linus Walleij; linux-
> arm-kernel at lists.infradead.org
> Subject: Re: [PATCH] ARM: smp: Introduce
> ARCH_HAS_COMMON_CORES_CLOCKtospeed-up boot
>
> On Sat, Jan 22, 2011 at 01:14:21PM +0530, Santosh Shilimkar wrote:
> > Surely whichever way we should remove the recalibration otherwise
> > every time secondary cpus spend ~ 200 ms there.
> >
> > Russell,
> > How do you like to address this?
>
> Well, the last piece of the puzzle which needs consideration is that
> omitting the calibration means that it produces user-visible
> /proc/cpuinfo changes. I can't say what effect that has. So, I
> think we need to do something about that to ensure that userspace
> won't be impacted should they be using the information in there.
>
Ok. /proc/cpuinfo uses the per-cpu lpj which gets updated as mentioned
as part of the arch cpufreq drivers. So this not seems to be the
problem.
> Maybe we need to initialize the secondary CPU bogomips values
> accordingly? Maybe zero is acceptable to mark that the calculation
> hasn't been done? Dunno.
The only issue is global lpj not getting updated as part of CPU
scaling. This is anyway broken today on SMP machines even without
cpufreq. It's getting corrected as part of the hotplug path if
recalibration is carried out. And same value gets copied to
per-cpu lpj for the secondary cores.
The above gets completely wrong if the individual CPUs can
scale with different speed.
Since per-cpu lpj is suppose to be taken care by scaling
Code, /proc/cpuinfo will get updated with right information.
With this, I think we could safely skip secondary cpu
calibration.
Regards,
Santosh
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-21 13:43 ` Santosh Shilimkar
2011-01-21 14:23 ` Linus Walleij
2011-01-21 15:00 ` Rob Herring
@ 2011-01-21 17:08 ` Russell King - ARM Linux
2 siblings, 0 replies; 13+ messages in thread
From: Russell King - ARM Linux @ 2011-01-21 17:08 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Jan 21, 2011 at 07:13:48PM +0530, Santosh Shilimkar wrote:
> > -----Original Message-----
> > From: Rob Herring [mailto:robherring2 at gmail.com]
> > Sent: Thursday, January 20, 2011 10:05 PM
> > To: Santosh Shilimkar
> > Cc: linux-arm-kernel at lists.infradead.org; Russell King; linux-
> > omap at vger.kernel.org; Linus Walleij
> > Subject: Re: [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK
> > to speed-up boot
>
> [..]
>
> > >>
> > >> There's already one way to do this with pre-calculated lpj.
> > >>
> > > How about the hot-plug path? This is not for just boot.
> >
> > The path is the same for hotplug and secondary boot, so yes for
> > both.
> > Plus you get the added benefit of speeding up the primary core boot
> > as well.
> >
> No 'preset_lpj' will not work for the hotplug path when
> cpufreq is active. It just useful only for boot in
> its current form.
Indeed, it will end up screwing up the loops_per_jiffy value. That
would seem to be a hole on other architectures too. I wonder if
anyone has tested hotplug on a cpufreq-scaled system.
> @@ -332,14 +345,19 @@ void __init smp_cpus_done(unsigned int max_cpus)
> int cpu;
> unsigned long bogosum = 0;
>
> - for_each_online_cpu(cpu)
> - bogosum += per_cpu(cpu_data, cpu).loops_per_jiffy;
> + if (!skip_secondary_calibrate) {
> + for_each_online_cpu(cpu)
> + bogosum += per_cpu(cpu_data, cpu).loops_per_jiffy;
>
> - printk(KERN_INFO "SMP: Total of %d processors activated "
> - "(%lu.%02lu BogoMIPS).\n",
> - num_online_cpus(),
> - bogosum / (500000/HZ),
> - (bogosum / (5000/HZ)) % 100);
> + printk(KERN_INFO "SMP: Total of %d processors activated "
> + "(%lu.%02lu BogoMIPS).\n",
> + num_online_cpus(),
> + bogosum / (500000/HZ),
> + (bogosum / (5000/HZ)) % 100);
> + } else {
> + printk(KERN_INFO "SMP: Total of %d processors
> activated.\n",
> + num_online_cpus());
> + }
Hmm. How about:
char bogosum[32];
if (!skip_secondary_calibrate) {
for_each_online_cpu(cpu)
bogosum += per_cpu(cpu_data, cpu).loops_per_jiffy;
snprintf(bogosum, sizeof(bogosum), " (%lu.%02lu BogoMIPS).\n",
bogosum / (500000/HZ), (bogosum / (5000/HZ)) % 100);
} else
bogosum[0] = '\0';
pr_info("SMP: Total of %d processors activated%s.\n",
num_online_cpus(), bogosum);
Looks neater and more compact and reduces the amount of string space
required.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot
2011-01-20 15:14 ` Rob Herring
2011-01-20 15:36 ` Santosh Shilimkar
@ 2011-01-20 16:21 ` Linus Walleij
1 sibling, 0 replies; 13+ messages in thread
From: Linus Walleij @ 2011-01-20 16:21 UTC (permalink / raw)
To: linux-arm-kernel
2011/1/20 Rob Herring <robherring2@gmail.com>:
> On 01/20/2011 03:42 AM, Santosh Shilimkar wrote:
>>
>> On some architectures, secondary cores shares clock with primiary
>> core and hence scale together. Hence secondary core lpj calibration
>> is not necessary and can be skipped to save considerable time.
>>
>> This can speed up the secondary cpu boot and hotplug cpu online
>> paths.
>>
>> Discussion thread:
>> ? ? ? ?http://www.spinics.net/lists/arm-kernel/msg111124.html
>>
>
> There's already one way to do this with pre-calculated lpj.
>
> Also, this isn't multi-platform friendly.
It can be made to be, then -
skip all Kconfig business (more elegant like this anyway):
static inline int skip_secondary_calibrate(void)
{
return machine_is_u8500() ||
machine_is_svp8500v1() ||
machine_is_svp8500v2() ||
machine_is_svp5500() ||
machine_is_b5500() ||
machine_is_s5500() ||
machine_is_omap4_panda();
}
Hm if this multi-platform stuff is desirable we probably need
a runtime arch_is_omap4(), arch_is_u8500() etc so this machine
list doesn't grow ridiculously long, but ftm this will work.
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2011-01-23 7:25 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-01-20 9:42 [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot Santosh Shilimkar
2011-01-20 15:14 ` Rob Herring
2011-01-20 15:36 ` Santosh Shilimkar
2011-01-20 16:34 ` Rob Herring
2011-01-21 13:43 ` Santosh Shilimkar
2011-01-21 14:23 ` Linus Walleij
2011-01-21 15:00 ` Rob Herring
2011-01-21 17:15 ` Russell King - ARM Linux
2011-01-22 7:44 ` [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK tospeed-up boot Santosh Shilimkar
2011-01-22 21:20 ` Russell King - ARM Linux
2011-01-23 7:25 ` [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCKtospeed-up boot Santosh Shilimkar
2011-01-21 17:08 ` [PATCH] ARM: smp: Introduce ARCH_HAS_COMMON_CORES_CLOCK to speed-up boot Russell King - ARM Linux
2011-01-20 16:21 ` Linus Walleij
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).