[RFC] Fixing CPU Hotplug for RealView Platforms

linux-arm-kernel.lists.infradead.org archive mirror

* [RFC] Fixing CPU Hotplug for RealView Platforms
@ 2010-12-07 16:43 Will Deacon
  2010-12-07 17:18 ` Russell King - ARM Linux
  0 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2010-12-07 16:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

Currently, CPU hotplug is broken for RealView platforms. I posted some
patches previously to try and address this, but they didn't solve the
problems fully:

http://lists.infradead.org/pipermail/linux-arm-kernel/2010-September/026157.html

I'm now revisiting the code and it looks like the main problem is when
we wish to *leave* the lowpower state. The enter/leave routines look
like this:

static inline void cpu_enter_lowpower(void)
{
	unsigned int v, smp_ctrl = get_smp_ctrl_mask();

	flush_cache_all();
	dsb();
	asm volatile(
	/*
	 * Turn off coherency
	 */
	"	mrc	p15, 0, %0, c1, c0, 1\n"
	"	bic	%0, %0, %1\n"
	"	mcr	p15, 0, %0, c1, c0, 1\n"
	/* ISB */
	"	mcr	p15, 0, %2, c7, c5, 4\n"
	/* Disable D-cache */
	"	mrc	p15, 0, %0, c1, c0, 0\n"
	"	bic	%0, %0, #0x04\n"
	"	mcr	p15, 0, %0, c1, c0, 0\n"
	  : "=&r" (v)
	  : "r" (smp_ctrl), "r" (0)
	  : "memory");
	isb();
}

static inline void cpu_leave_lowpower(void)
{
	unsigned int v, smp_ctrl = get_smp_ctrl_mask();

	asm volatile(	"mrc	p15, 0, %0, c1, c0, 0\n"
	"	orr	%0, %0, #0x04\n"
	"	mcr	p15, 0, %0, c1, c0, 0\n"
	"	mrc	p15, 0, %0, c1, c0, 1\n"
	"	orr	%0, %0, %1\n"
	"	mcr	p15, 0, %0, c1, c0, 1\n"
	  : "=&r" (v)
	  : "r" (smp_ctrl)
	  : "memory");
	isb();
}

The problem is that by turning off coherency, the contents of the D-cache
become stale. If data is prefetched into L1 between the flush_cache_all
invocation and disabling the D-cache then this data will still be present
when we come out of lowpower. Without coherency, we *must not* use this
data and so a D-cache invalidation to the PoC is required in
cpu_leave_lowpower().

On v6 this is a simple mcr instruction. On v7, we have to perform a set/way
operation across all ways of each cache until we reach the PoC (see the
scary but well commented v7_flush_dcache_all function). Implementing this
means extending the cpu_cache_fns struct and stubbing out the new function
for other caches, so I'd like to see if anybody has any better ideas before
I go ahead and make these changes.
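
For illustration, here is a minimal sketch of the set/way walk described above, written as portable C so that it can be compiled anywhere. It only computes the operand values; on real hardware each operand would be written with an mcr such as DCISW (p15, 0, Rt, c7, c6, 2). The helper names (decode_ccsidr, setway_operand, for_each_setway) are hypothetical and are not existing kernel functions. The field layout follows my reading of the ARMv7 CCSIDR and set/way operand formats, and the kernel's v7_flush_dcache_all does the equivalent work in assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry of one cache level, decoded from an ARMv7 CCSIDR value. */
struct cache_geom {
	unsigned int line_shift;	/* log2(line length in bytes) */
	unsigned int ways;
	unsigned int sets;
};

/* Hypothetical helper: decode the CCSIDR fields used by a set/way walk. */
static struct cache_geom decode_ccsidr(uint32_t ccsidr)
{
	struct cache_geom g;

	g.line_shift = (ccsidr & 0x7) + 4;		/* LineSize, bits [2:0] */
	g.ways = ((ccsidr >> 3) & 0x3ff) + 1;		/* Associativity, bits [12:3] */
	g.sets = ((ccsidr >> 13) & 0x7fff) + 1;		/* NumSets, bits [27:13] */
	return g;
}

/* Number of bits needed to index n items, i.e. ceil(log2(n)). */
static unsigned int index_bits(unsigned int n)
{
	unsigned int bits = 0;

	while ((1u << bits) < n)
		bits++;
	return bits;
}

/* Build the operand for one set/way operation; level is 0-based (0 == L1). */
static uint32_t setway_operand(const struct cache_geom *g, unsigned int level,
			       unsigned int way, unsigned int set)
{
	unsigned int way_bits = index_bits(g->ways);
	uint32_t val;

	assert(way < g->ways && set < g->sets && level < 7);
	val = (set << g->line_shift) | (level << 1);
	if (way_bits)			/* avoid an undefined shift by 32 */
		val |= way << (32 - way_bits);
	return val;
}

/* Visit every set/way of one level; op stands in for the mcr instruction. */
static unsigned int for_each_setway(const struct cache_geom *g,
				    unsigned int level,
				    void (*op)(uint32_t operand))
{
	unsigned int way, set, count = 0;

	for (way = 0; way < g->ways; way++)
		for (set = 0; set < g->sets; set++, count++)
			if (op)
				op(setway_operand(g, level, way, set));
	return count;
}
```

A full invalidate to the PoC would repeat this walk for each data or unified cache level reported by CLIDR, which is why the v7 case is so much heavier than the single v6 mcr.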

One possibility is not to turn off coherency, but if platform_do_lowpower
is more than a WFI I don't think this would be suitable.

Any thoughts?

Will


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-07 16:43 [RFC] Fixing CPU Hotplug for RealView Platforms Will Deacon
@ 2010-12-07 17:18 ` Russell King - ARM Linux
  2010-12-07 17:47   ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2010-12-07 17:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 07, 2010 at 04:43:10PM -0000, Will Deacon wrote:
> Hello,
> 
> Currently, CPU hotplug is broken for RealView platforms. I posted some
> patches previously to try and address this, but they didn't solve the
> problems fully:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2010-September/026157.html
> 
> I'm now revisiting the code and it looks like the main problem is when
> we wish to *leave* the lowpower state. The enter/leave routines look
> like this:
> 
...
> 
> The problem is that by turning off coherency, the contents of the D-cache
> becomes stale. If data is prefetched into L1 between the flush_cache_all
> invocation and disabling the D-cache then this data will still be present
> when we come out of lowpower. Without coherency, we *must not* use this
> data and so a D-cache invalidation to the PoC is required in
> cpu_leave_lowpower().

What if we fixed the cpu_reset functions for v6 and v7, and when a CPU
is taken offline, we actually go through a proper shutdown of that CPU
and call the reset vector, re-entering the boot loader?

We can only do this for CPUs other than the original boot CPU, because
the boot loader should be checking which are the secondary CPUs and
putting those into this simple WFI loop with the GIC appropriately
programmed.

This means when we re-activate the CPU, we'll be waking it up in
exactly the same way as we do when the kernel boots - and we have all
that code around just waiting to be used.


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-07 17:18 ` Russell King - ARM Linux
@ 2010-12-07 17:47   ` Will Deacon
  2010-12-08  6:03     ` Santosh Shilimkar
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Will Deacon @ 2010-12-07 17:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

> On Tue, Dec 07, 2010 at 04:43:10PM -0000, Will Deacon wrote:
> > Hello,
> >
> > Currently, CPU hotplug is broken for RealView platforms. I posted some
> > patches previously to try and address this, but they didn't solve the
> > problems fully:
> >
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2010-September/026157.html
> >
> > I'm now revisiting the code and it looks like the main problem is when
> > we wish to *leave* the lowpower state. The enter/leave routines look
> > like this:
> >
> ...
> >
> > The problem is that by turning off coherency, the contents of the D-cache
> > becomes stale. If data is prefetched into L1 between the flush_cache_all
> > invocation and disabling the D-cache then this data will still be present
> > when we come out of lowpower. Without coherency, we *must not* use this
> > data and so a D-cache invalidation to the PoC is required in
> > cpu_leave_lowpower().
> 
> What if we fixed the cpu_reset functions for v6 and v7, and when a CPU
> is taken offline, we actually go through a proper shutdown of that CPU
> and call the reset vector, re-entering the boot loader?

This will certainly solve our problem, but people might complain that it's
too heavyweight :) That said, for v7 this may be the only solution as the
platform can require an IMPLEMENTATION DEFINED initialisation routine to be
executed before enabling the D-cache out of reset. This should be something
that the bootloader does for us.

> We can only do this for CPUs other than the original boot CPU, because
> the boot loader should be checking which are the secondary CPUs and
> putting those into this simple WFI loop with the GIC appropriately
> programmed.
> 
> This means when we re-activate the CPU, we'll be waking it up in
> exactly the same way as we do when the kernel boots - and we have all
> that code around just waiting to be used.

As long as the bootloader doesn't mind, then this should work.

Will


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-07 17:47   ` Will Deacon
@ 2010-12-08  6:03     ` Santosh Shilimkar
  2010-12-08 13:20       ` Will Deacon
  2010-12-08 20:20     ` Russell King - ARM Linux
  2010-12-18 17:10     ` Russell King - ARM Linux
  2 siblings, 1 reply; 14+ messages in thread
From: Santosh Shilimkar @ 2010-12-08  6:03 UTC (permalink / raw)
  To: linux-arm-kernel

> -----Original Message-----
> From: linux-arm-kernel-bounces at lists.infradead.org
> [mailto:linux-arm-kernel-bounces at lists.infradead.org] On Behalf Of Will Deacon
> Sent: Tuesday, December 07, 2010 11:17 PM
> To: 'Russell King - ARM Linux'
> Cc: linux-arm-kernel at lists.infradead.org
> Subject: RE: [RFC] Fixing CPU Hotplug for RealView Platforms
>
> Hi Russell,
>
> > On Tue, Dec 07, 2010 at 04:43:10PM -0000, Will Deacon wrote:
> > > Hello,
> > >
> > > Currently, CPU hotplug is broken for RealView platforms. I posted some
> > > patches previously to try and address this, but they didn't solve the
> > > problems fully:
> > >
> > > http://lists.infradead.org/pipermail/linux-arm-kernel/2010-September/026157.html
> > >
> > > I'm now revisiting the code and it looks like the main problem is when
> > > we wish to *leave* the lowpower state. The enter/leave routines look
> > > like this:
> > >
> > ...
> > >
> > > The problem is that by turning off coherency, the contents of the D-cache
> > > becomes stale. If data is prefetched into L1 between the flush_cache_all
> > > invocation and disabling the D-cache then this data will still be present
> > > when we come out of lowpower. Without coherency, we *must not* use this
> > > data and so a D-cache invalidation to the PoC is required in
> > > cpu_leave_lowpower().
> >
> > What if we fixed the cpu_reset functions for v6 and v7, and when a CPU
> > is taken offline, we actually go through a proper shutdown of that CPU
> > and call the reset vector, re-entering the boot loader?
>
> This will certainly solve our problem, but people might complain that it's
> too heavyweight :) That said, for v7 this may be the only solution as the
> platform can require an IMPLEMENTATION DEFINED initialisation routine to be
> executed before enabling the D-cache out of reset. This should be something
> that the bootloader does for us.
>
> > We can only do this for CPUs other than the original boot CPU, because
> > the boot loader should be checking which are the secondary CPUs and
> > putting those into this simple WFI loop with the GIC appropriately
> > programmed.
> >
> > This means when we re-activate the CPU, we'll be waking it up in
> > exactly the same way as we do when the kernel boots - and we have all
> > that code around just waiting to be used.
>
One simpler thing which could work is to disable the "C" bit before flushing
the L1 cache. That way prefetching would be avoided and the cache would also
be in a clean state when restarting the core.

Regards,
Santosh


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-08  6:03     ` Santosh Shilimkar
@ 2010-12-08 13:20       ` Will Deacon
  0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2010-12-08 13:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Santosh,

> > > We can only do this for CPUs other than the original boot CPU, because
> > > the boot loader should be checking which are the secondary CPUs and
> > > putting those into this simple WFI loop with the GIC appropriately
> > > programmed.
> > >
> > > This means when we re-activate the CPU, we'll be waking it up in
> > > exactly the same way as we do when the kernel boots - and we have all
> > > that code around just waiting to be used.
> >
> One simpler thing which could work is to disable the "C" bit before flushing
> the L1 cache. That way prefetching would be avoided and the cache would also
> be in a clean state when restarting the core.

I like this idea because it's easy to implement! It does, however, rely
on caches not containing any random dirty lines when leaving the low-power
state. This behaviour is IMPLEMENTATION DEFINED out of reset, so it's
something that platform code will need to handle anyway.

On RealView, we only do a WFI to enter lowpower so your approach sounds
feasible. I'll put in a comment describing the potential problems with
random D-cache data out of reset so that if other platforms blindly copy
the code, at least they've been warned.
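
To make the ordering argument concrete, here is a toy model in portable C. It is only a sketch: the "cache" is an enable flag plus a count of valid lines, and every name in it (toy_dcache, toy_prefetch and so on) is hypothetical. It does not model dirty lines, coherency or the state of the cache out of reset, so it says nothing about the IMPLEMENTATION DEFINED caveat above. It only shows why clearing the C bit before the flush closes the prefetch window.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a cache is an enable bit plus a count of valid lines. */
struct toy_dcache {
	bool enabled;
	unsigned int valid_lines;
};

/* A speculative fill can only allocate while the cache is enabled. */
static void toy_prefetch(struct toy_dcache *c)
{
	if (c->enabled)
		c->valid_lines++;
}

/* Stand-in for flush_cache_all(): clean and invalidate everything. */
static void toy_flush(struct toy_dcache *c)
{
	c->valid_lines = 0;
}

/* Stand-in for clearing the C bit in the system control register. */
static void toy_disable(struct toy_dcache *c)
{
	c->enabled = false;
}

/* Current order: flush, then disable; a prefetch can slip in between. */
static unsigned int flush_then_disable(void)
{
	struct toy_dcache c = { .enabled = true, .valid_lines = 8 };

	toy_flush(&c);
	toy_prefetch(&c);	/* lands in the window before the C bit is cleared */
	toy_disable(&c);
	assert(!c.enabled);
	return c.valid_lines;
}

/* Proposed order: disable first, so nothing can allocate after the flush. */
static unsigned int disable_then_flush(void)
{
	struct toy_dcache c = { .enabled = true, .valid_lines = 8 };

	toy_disable(&c);
	toy_prefetch(&c);	/* no allocation: the cache is already off */
	toy_flush(&c);
	toy_prefetch(&c);
	assert(!c.enabled);
	return c.valid_lines;
}
```

In this model flush_then_disable() leaves one stale line behind and disable_then_flush() leaves none.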

Will


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-07 17:47   ` Will Deacon
  2010-12-08  6:03     ` Santosh Shilimkar
@ 2010-12-08 20:20     ` Russell King - ARM Linux
  2010-12-18 17:10     ` Russell King - ARM Linux
  2 siblings, 0 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2010-12-08 20:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 07, 2010 at 05:47:00PM -0000, Will Deacon wrote:
> Hi Russell,
> > What if we fixed the cpu_reset functions for v6 and v7, and when a CPU
> > is taken offline, we actually go through a proper shutdown of that CPU
> > and call the reset vector, re-entering the boot loader?
> 
> This will certainly solve our problem, but people might complain that it's
> too heavyweight :)

Our current hot plug-in code already re-runs most of the secondary CPU
initialization in the kernel - we reset the stack and jump back to
secondary_start_kernel().

What we'll be adding to that is the overheads in the boot loader (which
would happen at the point we go offline) and getting back to the kernel
code.  From that point, the additional things are looking up the CPU
type, calling its setup function, and enabling the MMU.

Everything from that point on would be identical to what already happens
on the hot plug-in path.


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-07 17:47   ` Will Deacon
  2010-12-08  6:03     ` Santosh Shilimkar
  2010-12-08 20:20     ` Russell King - ARM Linux
@ 2010-12-18 17:10     ` Russell King - ARM Linux
  2010-12-18 17:44       ` Will Deacon
  2 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2010-12-18 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 07, 2010 at 05:47:00PM -0000, Will Deacon wrote:
> Hi Russell,
> > What if we fixed the cpu_reset functions for v6 and v7, and when a CPU
> > is taken offline, we actually go through a proper shutdown of that CPU
> > and call the reset vector, re-entering the boot loader?
> 
> This will certainly solve our problem, but people might complain that it's
> too heavyweight :)

Well, I've taken some measurements from the CPU boot, and there appears
to be some interesting behaviour here:

Boot time bringup:

				   boot CPU	CPU1
Booting: 1084			-> 0ns		0ns	(about 1us per print)
cross call: 21750		-> 21.75us
Up: 267167			->		267.167us
CPU1: Booted secondary processor
secondary_init: 297834		->		297.834us
writing release: 310917		->		310.917us
release done: 320334		->		320.334us
released: 327750		-> 327.75us
Boot returned: 342917		-> 342.917us
sync'd: 343167			->		343.167us
CPU1: Unknown IPI message 0x1
Online: 218416334		->		218.416334ms

This looks reasonable - 300us taken to get from requesting the CPU to boot
in __cpu_up() to the CPU marking itself online.

The 218ms will be down to calibrate_delay().

CPU2 and CPU3 have very similar boot timings, so I'm pretty happy that
this timing is reliable.

Hotplug bringup:

Booting: 1000			-> 0ns		0ns		(1us per print)
Restarting: 3976375		->		3.976375ms
cross call: 3976625		-> 3.976625ms
Up: 4003125			->		4.003125ms
CPU1: Booted secondary processor
secondary_init: 4022583		->		4.022583ms
writing release: 4040750	->		4.04075ms
release done: 4051083		->		4.051083ms
released: 46509000		-> 4.6509ms
Boot returned: 51745708		-> 5.1745708ms
sync'd: 51745875		->		5.1745875ms
CPU1: Unknown IPI message 0x1
Switched to NOHz mode on CPU #1
Online: 281251041		->		281.251041ms

So, it appears to take 4ms to get from just before the call to
boot_secondary() in __cpu_up() to writing pen_release.

The secondary CPU appears to run from being woken up to writing the
pen release in about 40us - and then spends about 1ms spinning on
its lock waiting for the requesting CPU to catch up.

This can be repeated every time without exception when you bring a
CPU back online.

Looking at that 500us, it seems to be taken up by 'spin_unlock()' in
boot_secondary:

00000000 <boot_secondary>:
  a0:   ebfffffe        bl      0 <sched_clock>
  a4:   e59f3044        ldr     r3, [pc, #68]   ; f0 <boot_secondary+0xf0>
  a8:   e893000c        ldm     r3, {r2, r3}
  ac:   e0502002        subs    r2, r0, r2
  b0:   e0c13003        sbc     r3, r1, r3
  b4:   e59f004c        ldr     r0, [pc, #76]   ; 108 <boot_secondary+0x108>
  b8:   ebfffffe        bl      0 <printk>	; "released: %llu\n"
--spin_unlock--
  bc:   f57ff05f        dmb     sy
  c0:   e3a02000        mov     r2, #0  ; 0x0
  c4:   e59f3020        ldr     r3, [pc, #32]   ; ec <boot_secondary+0xec>
  c8:   e5832000        str     r2, [r3]
  cc:   f57ff04f        dsb     sy
  d0:   e320f004        sev
----
  d4:   e59f3018        ldr     r3, [pc, #24]   ; f4 <boot_secondary+0xf4>
  d8:   e5933000        ldr     r3, [r3]	; read pen_release
  dc:   e3730001        cmn     r3, #1  ; 0x1	; == -1?
  e0:   13e00025        mvnne   r0, #37 ; 0x25	; != -1 => -ENOSYS
  e4:   01a00002        moveq   r0, r2		; == -1 => 0
  e8:   e99da870        ldmib   sp, {r4, r5, r6, fp, sp, pc}
...
000001d8 <__cpu_up>:
 2dc:   ebfffffe        bl      0 <boot_secondary>
 2e0:   e1a05000        mov     r5, r0
 2e4:   ebfffffe        bl      0 <sched_clock>
 2e8:   e894000c        ldm     r4, {r2, r3}	; boot_start
 2ec:   e0502002        subs    r2, r0, r2	; sched_clock() - boot_start
 2f0:   e0c13003        sbc     r3, r1, r3
 2f4:   e59f0128        ldr     r0, [pc, #296]  ; 424 <__cpu_up+0x24c>
 2f8:   ebfffffe        bl      0 <printk>	; "Boot returned: %llu\n"

So there's not much going on in that path.

The CPU being brought online is doing this:

00000034 <_raw_spin_lock>:
  34:   e1a0c00d        mov     ip, sp
  38:   e92dd800        push    {fp, ip, lr, pc}
  3c:   e24cb004        sub     fp, ip, #4      ; 0x4
  40:   e3a03001        mov     r3, #1  ; 0x1
  44:   e1902f9f        ldrex   r2, [r0]
  48:   e3320000        teq     r2, #0  ; 0x0
  4c:   1320f002        wfene
  50:   01802f93        strexeq r2, r3, [r0]
  54:   03320000        teqeq   r2, #0  ; 0x0
  58:   1afffff9        bne     44 <_raw_spin_lock+0x10>
  5c:   f57ff05f        dmb     sy
  60:   e89da800        ldm     sp, {fp, sp, pc}

as it's waiting for the lock to be released.  So... what could be causing
the above code in boot_secondary()/__cpu_up() to take 500us when the
system's running?  The dmb, dsb, or sev?  Or the SCU trying to sort out
the str to release the lock?
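
For readers who do not want to decode the disassembly, the lock/unlock pair above has roughly the following shape. This is a portable C11 analogue, not the kernel's implementation: the names are hypothetical, the compare-and-swap stands in for the ldrex/strexeq pair, and acquire/release ordering stands in for the dmb instructions. WFE and SEV have no portable equivalent, so they appear only as comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical portable lock: 0 == unlocked, 1 == locked. */
struct toy_spinlock {
	atomic_uint locked;
};

/* One ldrex/strexeq attempt: succeed only if the lock word was 0. */
static bool toy_spin_trylock(struct toy_spinlock *l)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock word is observed as 0 and claimed. */
static void toy_spin_lock(struct toy_spinlock *l)
{
	while (!toy_spin_trylock(l))
		;	/* the real code executes WFE here */
}

/* Release: ordered store of 0, matching the dmb + str in the listing. */
static void toy_spin_unlock(struct toy_spinlock *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
	atomic_store_explicit(&l->locked, 0, memory_order_release);
	/* the real code follows the store with DSB and SEV */
}
```

Note that the unlock path is a plain store, whereas the waiter uses exclusive accesses; any difference in how the memory system treats the two is outside what this sketch can show.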


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-18 17:10     ` Russell King - ARM Linux
@ 2010-12-18 17:44       ` Will Deacon
  2010-12-18 19:22         ` Russell King - ARM Linux
  0 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2010-12-18 17:44 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

Thanks for looking into this.

On Sat, 2010-12-18 at 17:10 +0000, Russell King - ARM Linux wrote:
> Boot time bringup:
> 
[...]

> CPU2 and CPU3 have very similar boot timings, so I'm pretty happy that
> this timing is reliable.
> 
Looks sane.

> Hotplug bringup:
> 
> Booting: 1000                   -> 0ns          0ns             (1us per print)
> Restarting: 3976375             ->              3.976375ms
> cross call: 3976625             -> 3.976625ms
> Up: 4003125                     ->              4.003125ms
> CPU1: Booted secondary processor
> secondary_init: 4022583         ->              4.022583ms
> writing release: 4040750        ->              4.04075ms
> release done: 4051083           ->              4.051083ms
> released: 46509000              -> 4.6509ms
> Boot returned: 51745708         -> 5.1745708ms
> sync'd: 51745875                ->              5.1745875ms
> CPU1: Unknown IPI message 0x1
> Switched to NOHz mode on CPU #1
> Online: 281251041               ->              281.251041ms
> 
> So, it appears to take 4ms to get from just before the call to
> boot_secondary() in __cpu_up() to writing pen_release.
> 
> The secondary CPU appears to run from being woken up to writing the
> pen release in about 40us - and then spends about 1ms spinning on
> its lock waiting for the requesting CPU to catch up.
> 
> This can be repeated every time without exception when you bring a
> CPU back online.
> 
Hmm, this sounds needlessly expensive.

> Looking at that 500us, it seems to be taken up by 'spin_unlock()' in
> boot_secondary:
> 
> 00000000 <boot_secondary>:

[...]

> --spin_unlock--
>   bc:   f57ff05f        dmb     sy
>   c0:   e3a02000        mov     r2, #0  ; 0x0
>   c4:   e59f3020        ldr     r3, [pc, #32]   ; ec <boot_secondary+0xec>
>   c8:   e5832000        str     r2, [r3]
>   cc:   f57ff04f        dsb     sy
>   d0:   e320f004        sev
> ----

One thing that might be worth trying is changing spin_unlock to use
strex [alongside a dummy ldrex]. There could be some QoS logic at L2
which favours exclusive accesses, meaning that the unlock is starved by
the lock. I don't have access to a board at the moment, so this is
purely speculation!

> The CPU being brought online is doing this:
> 
> 00000034 <_raw_spin_lock>:
>   34:   e1a0c00d        mov     ip, sp
>   38:   e92dd800        push    {fp, ip, lr, pc}
>   3c:   e24cb004        sub     fp, ip, #4      ; 0x4
>   40:   e3a03001        mov     r3, #1  ; 0x1
>   44:   e1902f9f        ldrex   r2, [r0]
>   48:   e3320000        teq     r2, #0  ; 0x0
>   4c:   1320f002        wfene
>   50:   01802f93        strexeq r2, r3, [r0]
>   54:   03320000        teqeq   r2, #0  ; 0x0
>   58:   1afffff9        bne     44 <_raw_spin_lock+0x10>
>   5c:   f57ff05f        dmb     sy
>   60:   e89da800        ldm     sp, {fp, sp, pc}
> 
> as it's waiting for the lock to be released.  So... what could be causing
> the above code in boot_secondary()/__cpu_up() to take 500us when the
> system's running?  The dmb, dsb, or sev?  Or the SCU trying to sort out
> the str to release the lock?

Another experiment would be to remove the wfe/sev instructions to see if
they're eating cycles. I think a WFE on the A9 disables a bunch of
clocks, so that could be taking time to do.

<shameless plug>
You could try using perf to identify the most expensive instructions in
the functions above (assuming interrupts are enabled).
</shameless plug>

Cheers,

Will


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-18 17:44       ` Will Deacon
@ 2010-12-18 19:22         ` Russell King - ARM Linux
  0 siblings, 0 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2010-12-18 19:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Dec 18, 2010 at 05:44:47PM +0000, Will Deacon wrote:
> > Hotplug bringup:
> > 
> > Booting: 1000                   -> 0ns          0ns             (1us per print)
> > Restarting: 3976375             ->              3.976375ms
> > cross call: 3976625             -> 3.976625ms
> > Up: 4003125                     ->              4.003125ms
> > CPU1: Booted secondary processor
> > secondary_init: 4022583         ->              4.022583ms
> > writing release: 4040750        ->              4.04075ms
> > release done: 4051083           ->              4.051083ms
> > released: 46509000              -> 4.6509ms
> > Boot returned: 51745708         -> 5.1745708ms
> > sync'd: 51745875                ->              5.1745875ms
> > CPU1: Unknown IPI message 0x1
> > Switched to NOHz mode on CPU #1
> > Online: 281251041               ->              281.251041ms
> > 
> > So, it appears to take 4ms to get from just before the call to
> > boot_secondary() in __cpu_up() to writing pen_release.
> > 
> > The secondary CPU appears to run from being woken up to writing the
> > pen release in about 40us - and then spends about 1ms spinning on
> > its lock waiting for the requesting CPU to catch up.
> > 
> > This can be repeated every time without exception when you bring a
> > CPU back online.
> > 
> Hmm, this sounds needlessly expensive.

Actually, I'm starting to get concerned about doing timing measurements
on Versatile Express - I'm seeing some unexplainable issues with the
Versatile Express platform.

I occasionally see the kernel get stuck when initializing the CLCD - and
I think this is a hardware lockup - pressing the red 'reset/power on'
button is ignored, and the only way to recover it is to press the
black 'power off' button first.

Also I keep running into some weird stuff which causes the MMC to
underflow, serial output to be corrupted, and rootfs not to be mounted
which is 100% reliable with some kernels (iow, the built kernel just
will not boot no matter how many times you attempt to do so.)  I've
sent Catalin & Philippe a copy of one such kernel which exhibits this
behaviour a few days ago (but I think they're on holiday.)

Anyway, I decided to implement a slightly different method of measuring
the time taken, and the apparent long delays have gone - I suspect that
was something to do with printk.  I'm now logging the times into an
array, and later printing out the values.
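
A minimal userspace sketch of that kind of measurement is below, assuming a fixed-size array and C11's timespec_get() as a stand-in for sched_clock(). The names (stamp_record, stamp_dump, MAX_STAMPS) are hypothetical and this is not the instrumentation that produced the numbers in this thread. The point is only that the measured path stores a label and a timestamp, and all formatting happens afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define MAX_STAMPS 32

struct stamp {
	const char *label;
	uint64_t ns;
};

static struct stamp stamps[MAX_STAMPS];
static unsigned int nr_stamps;

/* Stand-in for sched_clock(): current time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hot path: store only, no formatting and no console output. */
static void stamp_record(const char *label)
{
	assert(label);
	if (nr_stamps < MAX_STAMPS) {
		stamps[nr_stamps].label = label;
		stamps[nr_stamps].ns = now_ns();
		nr_stamps++;
	}
}

/* Later, outside the measured region: print times relative to the first. */
static void stamp_dump(void)
{
	unsigned int i;

	for (i = 0; i < nr_stamps; i++)
		printf("SMP: %s: %llu\n", stamps[i].label,
		       (unsigned long long)(stamps[i].ns - stamps[0].ns));
}
```

Calling stamp_record() at each point of interest and stamp_dump() once the CPU is online keeps the cost of printing out of the intervals being measured.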

So, CPU1 boot:

SMP: Start: 0
SMP: Booting: 916
SMP: Cross call: 3083
SMP: Pen released: 278416
SMP: Unlock: 279583
SMP: Boot returned: 280333

SMP: Sec: up: 238666
SMP: Sec: enter: 264333
SMP: Sec: pen write: 267083
SMP: Sec: pen done: 268916
SMP: Sec: exit: 279916
SMP: Sec: calibrate: 328416
SMP: Sec: online: 218380875

CPU1 hotplug:
SMP: Start: 0
SMP: Booting: 833
SMP: Cross call: 4250
SMP: Pen released: 51500
SMP: Unlock: 52667
SMP: Boot returned: 53500

SMP: Sec: restart: 4667
SMP: Sec: up: 7167
SMP: Sec: enter: 31000
SMP: Sec: pen write: 39667
SMP: Sec: pen done: 42167
SMP: Sec: exit: 53000
SMP: Sec: calibrate: 104583
SMP: Sec: online: 221423333

This looks far saner.

Anyway, with the delay loop calibration, we're looking at a boot time of
about 110us to the delay loop calibration, and 221ms for a secondary CPU
using the existing code.  I don't think that will go up significantly if
we re-vector offlined CPUs back through the reset vector.


* [RFC] Fixing CPU Hotplug for RealView Platforms
@ 2010-12-20  8:16 Vincent Guittot
  2011-01-03 10:46 ` Russell King - ARM Linux
  0 siblings, 1 reply; 14+ messages in thread
From: Vincent Guittot @ 2010-12-20  8:16 UTC (permalink / raw)
  To: linux-arm-kernel

I'm also interested in hotplug latency measurement and have done some
on my CA9 platform u8500. I have the same kind of result for plugging
a secondary cpu:
  total duration = 295ms
  166 us for the low level cpu wake up
  228ms between the return from platform_cpu_die and the cpu becomes online

I have added some trace events for doing these measurements and I'd
like to add some generic trace points in the cpu hotplug code, like we
already have in the power management code (cpuidle, suspend, cpufreq ...).
These traces could be used with power events for studying the impact
of cpu hotplug in the complete power management scheme.



> Message: 4
> Date: Sat, 18 Dec 2010 19:22:13 +0000
> From: Russell King - ARM Linux <linux@arm.linux.org.uk>
> To: Will Deacon <will.deacon@arm.com>
> Cc: linux-arm-kernel at lists.infradead.org
> Subject: Re: [RFC] Fixing CPU Hotplug for RealView Platforms
> Message-ID: <20101218192213.GL9937@n2100.arm.linux.org.uk>
> Content-Type: text/plain; charset=us-ascii
>
> On Sat, Dec 18, 2010 at 05:44:47PM +0000, Will Deacon wrote:
>> > Hotplug bringup:
>> >
>> > Booting: 1000                   -> 0ns          0ns             (1us per print)
>> > Restarting: 3976375             ->              3.976375ms
>> > cross call: 3976625             -> 3.976625ms
>> > Up: 4003125                     ->              4.003125ms
>> > CPU1: Booted secondary processor
>> > secondary_init: 4022583         ->              4.022583ms
>> > writing release: 4040750        ->              4.04075ms
>> > release done: 4051083           ->              4.051083ms
>> > released: 46509000              -> 4.6509ms
>> > Boot returned: 51745708         -> 5.1745708ms
>> > sync'd: 51745875                ->              5.1745875ms
>> > CPU1: Unknown IPI message 0x1
>> > Switched to NOHz mode on CPU #1
>> > Online: 281251041               ->              281.251041ms
>> >
>> > So, it appears to take 4ms to get from just before the call to
>> > boot_secondary() in __cpu_up() to writing pen_release.
>> >
>> > The secondary CPU appears to run from being woken up to writing the
>> > pen release in about 40us - and then spends about 1ms spinning on
>> > its lock waiting for the requesting CPU to catch up.
>> >
>> > This can be repeated every time without exception when you bring a
>> > CPU back online.
>> >
>> Hmm, this sounds needlessly expensive.
>
> Actually, I'm starting to get concerned about doing timing measurements
> on Versatile Express - I'm seeing some unexplainable issues with the
> Versatile Express platform.
>
> I occasionally see the kernel get stuck when initializing the CLCD - and
> I think this is a hardware lockup - pressing the red 'reset/power on'
> button is ignored, and the only way to recover it is to press the
> black 'power off' button first.
>
> Also I keep running into some weird stuff which causes the MMC to
> underflow, serial output to be corrupted, and rootfs not to be mounted
> which is 100% reliable with some kernels (iow, the built kernel just
> will not boot no matter how many times you attempt to do so.)  I've
> sent Catalin & Philippe a copy of one such kernel which exhibits this
> behaviour a few days ago (but I think they're on holiday.)
>
> Anyway, I decided to implement a slightly different method of measuring
> the time taken, and the apparent long delays have gone - I suspect that
> was something to do with printk.  I'm now logging the times into an
> array, and later printing out the values.
>
> So, CPU1 boot:
>
> SMP: Start: 0
> SMP: Booting: 916
> SMP: Cross call: 3083
> SMP: Pen released: 278416
> SMP: Unlock: 279583
> SMP: Boot returned: 280333
>
> SMP: Sec: up: 238666
> SMP: Sec: enter: 264333
> SMP: Sec: pen write: 267083
> SMP: Sec: pen done: 268916
> SMP: Sec: exit: 279916
> SMP: Sec: calibrate: 328416
> SMP: Sec: online: 218380875
>
> CPU1 hotplug:
> SMP: Start: 0
> SMP: Booting: 833
> SMP: Cross call: 4250
> SMP: Pen released: 51500
> SMP: Unlock: 52667
> SMP: Boot returned: 53500
>
> SMP: Sec: restart: 4667
> SMP: Sec: up: 7167
> SMP: Sec: enter: 31000
> SMP: Sec: pen write: 39667
> SMP: Sec: pen done: 42167
> SMP: Sec: exit: 53000
> SMP: Sec: calibrate: 104583
> SMP: Sec: online: 221423333
>
> This looks far saner.
>
> Anyway, with the delay loop calibration, we're looking at a boot time of
> about 110us to the delay loop calibration, and 221ms for a secondary CPU
> using the existing code.  I don't think that will go up significantly if
> we re-vector offlined CPUs back through the reset vector.
>


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2010-12-20  8:16 Vincent Guittot
@ 2011-01-03 10:46 ` Russell King - ARM Linux
  2011-01-03 17:39   ` Vincent Guittot
  0 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2011-01-03 10:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 20, 2010 at 09:16:15AM +0100, Vincent Guittot wrote:
> I'm also interested in hotplug latency measurement and have done some
> on my CA9 platform u8500. I have the same kind of result for plugging
> a secondary cpu:
>   total duration = 295ms
>   166 us for the low level cpu wake up
>   228ms between the return from platform_cpu_die and the cpu becomes online
> 
> I have added some trace events for doing these measurements and I'd
> like to add some generic traces point in the cpu hotplug code like we
> already have in power management code (cpuidle, suspend, cpufreq ...)
> These traces could be used with power events for studying the impact
> of cpu hotplug in the complete power management scheme.

Note that if you pass lpj=<number> to the kernel, you'll bypass the
calibration and have a faster response to CPU onlining.


* [RFC] Fixing CPU Hotplug for RealView Platforms
  2011-01-03 10:46 ` Russell King - ARM Linux
@ 2011-01-03 17:39   ` Vincent Guittot
  2011-01-03 18:03     ` Russell King - ARM Linux
  0 siblings, 1 reply; 14+ messages in thread
From: Vincent Guittot @ 2011-01-03 17:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 3 January 2011 11:46, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Dec 20, 2010 at 09:16:15AM +0100, Vincent Guittot wrote:
>> I'm also interested in hotplug latency measurement and have done some
>> on my CA9 platform u8500. I have the same kind of result for plugging
>> a secondary cpu:
>>   total duration = 295ms
>>   166 us for the low level cpu wake up
>>   228ms between the return from platform_cpu_die and the cpu becomes online
>>
>> I have added some trace events for doing these measurements and I'd
>> like to add some generic traces point in the cpu hotplug code like we
>> already have in power management code (cpuidle, suspend, cpufreq ...)
>> These traces could be used with power events for studying the impact
>> of cpu hotplug in the complete power management scheme.
>
> Note that if you pass lpj=<number> to the kernel, you'll bypass the
> calibration and have a faster response to CPU onlining.
>

Yes, the total duration drops to 40ms.

* [RFC] Fixing CPU Hotplug for RealView Platforms
  2011-01-03 17:39   ` Vincent Guittot
@ 2011-01-03 18:03     ` Russell King - ARM Linux
  2011-01-04  8:55       ` Vincent Guittot
  0 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2011-01-03 18:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 03, 2011 at 06:39:56PM +0100, Vincent Guittot wrote:
> On 3 January 2011 11:46, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Mon, Dec 20, 2010 at 09:16:15AM +0100, Vincent Guittot wrote:
> >> I'm also interested in hotplug latency measurement and have done some
> >> on my CA9 platform u8500. I have the same kind of result for plugging
> >> a secondary cpu:
> >>   total duration = 295ms
> >>   166 us for the low level cpu wake up
> >>   228ms between the return from platform_cpu_die and the cpu becomes online
> >>
> >> I have added some trace events for doing these measurements and I'd
> >> like to add some generic traces point in the cpu hotplug code like we
> >> already have in power management code (cpuidle, suspend, cpufreq ...)
> >> These traces could be used with power events for studying the impact
> >> of cpu hotplug in the complete power management scheme.
> >
> > Note that if you pass lpj=<number> to the kernel, you'll bypass the
> > calibration and have a faster response to CPU onlining.
> >
> 
> Yes, the total duration drops to 40ms.

I'm not sure that I believe it takes 40ms, as it's only taking about
104us to get to the calibration for me.  The calibration when disabled
is virtually a do-nothing, so shouldn't be taking 40ms.

The trace code may be hitting locks which are interfering with the
timing - this below is entirely lockless with a proper (and working)
sched_clock() implementation.

See mach-vexpress/platsmp.c for the things to add.  Also note that it
requires the SMP changes (and for good measure, the clksrc stuff too.)
Might be easier to apply on top of linux-next.

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 11d6a94..0b3a15a 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -40,6 +40,7 @@
 #include <asm/ptrace.h>
 #include <asm/localtimer.h>
 
+#include <asm/smp-debug.h>
 /*
  * as from 2.5, kernels no longer have an init_tasks structure
  * so we need some other way of telling a new secondary core
@@ -47,6 +48,8 @@
  */
 struct secondary_data secondary_data;
 
+struct smp_debug smp_debug;
+
 enum ipi_msg_type {
 	IPI_TIMER = 2,
 	IPI_RESCHEDULE,
@@ -55,6 +58,24 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 };
 
+static const char *debug_names[] = {
+	[D_START]		= "Start",
+	[D_BOOT_SECONDARY_CALL]	= "Booting",
+	[D_CROSS_CALL]		= "Cross call",
+	[D_PEN_RELEASED]	= "Pen released",
+	[D_UNLOCK]		= "Unlock",
+	[D_BOOT_SECONDARY_RET]	= "Boot returned",
+	[D_BOOT_DONE]		= "Boot complete",
+	[D_SEC_RESTART]		= "Sec: restart",
+	[D_SEC_UP]		= "Sec: up",
+	[D_SEC_PLAT_ENTER]	= "Sec: enter",
+	[D_SEC_PLAT_PEN_WRITE]	= "Sec: pen write",
+	[D_SEC_PLAT_PEN_DONE]	= "Sec: pen done",
+	[D_SEC_PLAT_EXIT]	= "Sec: exit",
+	[D_SEC_CALIBRATE]	= "Sec: calibrate",
+	[D_SEC_ONLINE]		= "Sec: online",
+};
+
 int __cpuinit __cpu_up(unsigned int cpu)
 {
 	struct cpuinfo_arm *ci = &per_cpu(cpu_data, cpu);
@@ -111,7 +132,10 @@ int __cpuinit __cpu_up(unsigned int cpu)
 	/*
 	 * Now bring the CPU into our world.
 	 */
+	smp_debug_mark(D_START);
+	smp_debug_mark(D_BOOT_SECONDARY_CALL);
 	ret = boot_secondary(cpu, idle);
+	smp_debug_mark(D_BOOT_SECONDARY_RET);
 	if (ret == 0) {
 		unsigned long timeout;
 
@@ -135,6 +159,7 @@ int __cpuinit __cpu_up(unsigned int cpu)
 	} else {
 		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
 	}
+	smp_debug_mark(D_BOOT_DONE);
 
 	secondary_data.stack = NULL;
 	secondary_data.pgdir = 0;
@@ -148,6 +173,14 @@ int __cpuinit __cpu_up(unsigned int cpu)
 	}
 
 	pgd_free(&init_mm, pgd);
+{
+	int i;
+	for (i = 0; i < D_NUM; i++) {
+		if (debug_names[i] && smp_debug.entry[i])
+			pr_info("SMP: %s: %llu\n", debug_names[i],
+				smp_debug.entry[i] - smp_debug.entry[0]);
+	}
+}
 
 	return ret;
 }
@@ -245,6 +278,8 @@ void __ref cpu_die(void)
 	 */
 	platform_cpu_die(cpu);
 
+	smp_debug_mark(D_SEC_RESTART);
+
 	/*
 	 * Do not return to the idle loop - jump back to the secondary
 	 * cpu initialisation.  There's some initialisation which needs
@@ -278,6 +313,7 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
 	struct mm_struct *mm = &init_mm;
 	unsigned int cpu = smp_processor_id();
 
+	smp_debug_mark(D_SEC_UP);
 	printk("CPU%u: Booted secondary processor\n", cpu);
 
 	/*
@@ -312,6 +348,8 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
 	 */
 	percpu_timer_setup();
 
+	smp_debug_mark(D_SEC_CALIBRATE);
+
 	calibrate_delay();
 
 	smp_store_cpu_info(cpu);
@@ -320,6 +358,7 @@ asmlinkage void __cpuinit secondary_start_kernel(void)
 	 * OK, now it's safe to let the boot CPU continue
 	 */
 	set_cpu_online(cpu, true);
+	smp_debug_mark(D_SEC_ONLINE);
 
 	/*
 	 * OK, it's off to the idle thread for us
diff --git a/arch/arm/mach-vexpress/platsmp.c b/arch/arm/mach-vexpress/platsmp.c
index b1687b6..0a77a6b 100644
--- a/arch/arm/mach-vexpress/platsmp.c
+++ b/arch/arm/mach-vexpress/platsmp.c
@@ -27,7 +27,7 @@
 #include "core.h"
 
 extern void vexpress_secondary_startup(void);
-
+#include <asm/smp-debug.h>
 /*
  * control for which core is the next to come out of the secondary
  * boot "holding pen"
@@ -56,6 +56,7 @@ static DEFINE_SPINLOCK(boot_lock);
 
 void __cpuinit platform_secondary_init(unsigned int cpu)
 {
+	smp_debug_mark(D_SEC_PLAT_ENTER);
 	/*
 	 * if any interrupts are already enabled for the primary
 	 * core (e.g. timer irq), then they will not have been enabled
@@ -67,13 +68,16 @@ void __cpuinit platform_secondary_init(unsigned int cpu)
 	 * let the primary processor know we're out of the
 	 * pen, then head off into the C entry point
 	 */
+	smp_debug_mark(D_SEC_PLAT_PEN_WRITE);
 	write_pen_release(-1);
+	smp_debug_mark(D_SEC_PLAT_PEN_DONE);
 
 	/*
 	 * Synchronise with the boot thread.
 	 */
 	spin_lock(&boot_lock);
 	spin_unlock(&boot_lock);
+	smp_debug_mark(D_SEC_PLAT_EXIT);
 }
 
 int __cpuinit boot_secondary(unsigned int cpu, struct task_struct *idle)
@@ -99,6 +103,7 @@ int __cpuinit boot_secondary(unsigned int cpu, struct task_struct *idle)
 	 * the boot monitor to read the system wide flags register,
 	 * and branch to the address found there.
 	 */
+	smp_debug_mark(D_CROSS_CALL);
 	smp_cross_call(cpumask_of(cpu), 1);
 
 	timeout = jiffies + (1 * HZ);
@@ -109,12 +114,14 @@ int __cpuinit boot_secondary(unsigned int cpu, struct task_struct *idle)
 
 		udelay(10);
 	}
+	smp_debug_mark(D_PEN_RELEASED);
 
 	/*
 	 * now the secondary core is starting up let it run its
 	 * calibrations, then wait for it to finish
 	 */
 	spin_unlock(&boot_lock);
+	smp_debug_mark(D_UNLOCK);
 
 	return pen_release != -1 ? -ENOSYS : 0;
 }
--- /dev/null	2010-08-07 18:16:05.574112050 +0100
+++ arch/arm/include/asm/smp-debug.h	2010-12-18 22:03:23.622580304 +0000
@@ -0,0 +1,39 @@
+#include <linux/sched.h>
+
+enum {
+	D_START,
+	D_INIT_IDLE,
+	D_PGD_ALLOC,
+	D_IDMAP_ADD,
+	D_BOOT_SECONDARY_CALL,
+	D_CROSS_CALL,
+	D_PEN_RELEASED,
+	D_UNLOCK,
+	D_BOOT_SECONDARY_RET,
+	D_BOOT_DONE,
+	D_IDMAP_DEL,
+	D_PGD_FREE,
+
+	D_SEC = 16,
+	D_SEC_RESTART = D_SEC,
+	D_SEC_UP,
+	D_SEC_PLAT_ENTER,
+	D_SEC_PLAT_PEN_WRITE,
+	D_SEC_PLAT_PEN_DONE,
+	D_SEC_PLAT_EXIT,
+	D_SEC_CALIBRATE,
+	D_SEC_ONLINE,
+	
+	D_NUM,
+};
+
+struct smp_debug {
+	unsigned long long	entry[D_NUM];
+};
+
+extern struct smp_debug smp_debug;
+
+static inline void smp_debug_mark(int ent)
+{
+	smp_debug.entry[ent] = sched_clock();
+}
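
The marker scheme in the patch above can be tried outside the kernel. This
is only a user-space sketch of the same idea, not part of the patch: it
assumes clock_gettime(CLOCK_MONOTONIC) as a stand-in for sched_clock(), and
the names marks[], mark() and report() are hypothetical.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical marker indices, analogous to the D_* enum in the patch. */
enum { M_START, M_WORK_DONE, M_NUM };

static unsigned long long marks[M_NUM];

/* Stand-in for sched_clock(): monotonic time in nanoseconds. */
static unsigned long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (unsigned long long)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/* Record a timestamp: a single store, no locks taken. */
static void mark(int ent)
{
	assert(ent >= 0 && ent < M_NUM);
	marks[ent] = now_ns();
}

/* Print every named, recorded marker relative to M_START. */
static void report(const char *const names[])
{
	int i;

	for (i = 0; i < M_NUM; i++)
		if (names[i] && marks[i])
			printf("%s: %llu ns\n", names[i],
			       marks[i] - marks[M_START]);
}
```

Calling mark(M_START) before the code of interest, mark(M_WORK_DONE) after
it, and then report() with an array of names gives the same style of output
as the pr_info() loop in the patch: each marker as an offset from the first.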

* [RFC] Fixing CPU Hotplug for RealView Platforms
  2011-01-03 18:03     ` Russell King - ARM Linux
@ 2011-01-04  8:55       ` Vincent Guittot
  0 siblings, 0 replies; 14+ messages in thread
From: Vincent Guittot @ 2011-01-04  8:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 3 January 2011 19:03, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Jan 03, 2011 at 06:39:56PM +0100, Vincent Guittot wrote:
>> On 3 January 2011 11:46, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Mon, Dec 20, 2010 at 09:16:15AM +0100, Vincent Guittot wrote:
>> >> I'm also interested in hotplug latency measurement and have done some
>> >> on my CA9 platform u8500. I have the same kind of result for plugging
>> >> a secondary cpu:
>> >>   total duration = 295ms
>> >>   166 us for the low level cpu wake up
>> >>   228ms between the return from platform_cpu_die and the cpu becomes online
>> >>
>> >> I have added some trace events for doing these measurements and I'd
>> >> like to add some generic traces point in the cpu hotplug code like we
>> >> already have in power management code (cpuidle, suspend, cpufreq ...)
>> >> These traces could be used with power events for studying the impact
>> >> of cpu hotplug in the complete power management scheme.
>> >
>> > Note that if you pass lpj=<number> to the kernel, you'll bypass the
>> > calibration and have a faster response to CPU onlining.
>> >
>>
>> Yes, the total duration drops to 40ms.
>
> I'm not sure that I believe it takes 40ms, as it's only taking about
> 104us to get to the calibration for me.  The calibration when disabled
> is virtually a do-nothing, so shouldn't be taking 40ms.
>
> The trace code may be hitting locks which are interfering with the
> timing - this below is entirely lockless with a proper (and working)
> sched_clock() implementation.
>
> See mach-vexpress/platsmp.c for the things to add.  Also note that it
> requires the SMP changes (and for good measure, the clksrc stuff too.)
> Might be easier to apply on top of linux-next.
>

In fact, 40ms is the total duration of the cpu_up function. The time
between the return from platform_cpu_die and the CPU coming online is
now 93us.
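
A rough way to cross-check the total figure from user space is to time the
write to the CPU's sysfs "online" file, which should block until the
hotplug operation has finished. This is a minimal sketch under that
assumption, not the trace-event code used for the numbers above; the helper
name timed_write_ns() is hypothetical.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: write val to path and return the time the
 * write() call took in nanoseconds, or -1 on error.
 */
static long long timed_write_ns(const char *path, const char *val)
{
	struct timespec t0, t1;
	ssize_t n;
	int fd;

	assert(path && val);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	n = write(fd, val, strlen(val));
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);

	if (n < 0)
		return -1;

	return (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}
```

Run as root, timed_write_ns("/sys/devices/system/cpu/cpu1/online", "0")
followed by the same call with "1" gives the offline and online durations
as seen by the caller. That includes syscall and notifier overhead, so it
would be expected to sit nearer the total cpu_up time than the 93us
low-level figure.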
