linux-arm-kernel.lists.infradead.org archive mirror
* [RFC] Fixing CPU Hotplug for RealView Platforms
@ 2010-12-07 16:43 Will Deacon
  2010-12-07 17:18 ` Russell King - ARM Linux
From: Will Deacon @ 2010-12-07 16:43 UTC
  To: linux-arm-kernel

Hello,

Currently, CPU hotplug is broken for RealView platforms. I posted some
patches previously to try and address this, but they didn't solve the
problems fully:

http://lists.infradead.org/pipermail/linux-arm-kernel/2010-September/026157.html

I'm now revisiting the code and it looks like the main problem is when
we wish to *leave* the lowpower state. The enter/leave routines look
like this:


static inline void cpu_enter_lowpower(void)
{
	unsigned int v, smp_ctrl = get_smp_ctrl_mask();

	flush_cache_all();
	dsb();
	asm volatile(
	/*
	 * Turn off coherency
	 */
	"	mrc	p15, 0, %0, c1, c0, 1\n"
	"	bic	%0, %0, %1\n"
	"	mcr	p15, 0, %0, c1, c0, 1\n"
	/* ISB */
	"	mcr	p15, 0, %2, c7, c5, 4\n"
	/* Disable D-cache */
	"	mrc	p15, 0, %0, c1, c0, 0\n"
	"	bic	%0, %0, #0x04\n"
	"	mcr	p15, 0, %0, c1, c0, 0\n"
	  : "=&r" (v)
	  : "r" (smp_ctrl), "r" (0)
	  : "memory");
	isb();
}

static inline void cpu_leave_lowpower(void)
{
	unsigned int v, smp_ctrl = get_smp_ctrl_mask();

	asm volatile(	"mrc	p15, 0, %0, c1, c0, 0\n"
	"	orr	%0, %0, #0x04\n"
	"	mcr	p15, 0, %0, c1, c0, 0\n"
	"	mrc	p15, 0, %0, c1, c0, 1\n"
	"	orr	%0, %0, %1\n"
	"	mcr	p15, 0, %0, c1, c0, 1\n"
	  : "=&r" (v)
	  : "r" (smp_ctrl)
	  : "memory");
	isb();
}


The problem is that once coherency is turned off, the contents of the
D-cache become stale. If data is prefetched into L1 between the
flush_cache_all() invocation and disabling the D-cache, that data will
still be present when we come out of lowpower. Without coherency, we
*must not* use this data, so a D-cache invalidation to the PoC is
required in cpu_leave_lowpower().

On v6 this is a simple mcr instruction. On v7, we have to perform a set/way
operation across all ways of each cache until we reach the PoC (see the
scary but well-commented v7_flush_dcache_all function). Implementing this
means extending the cpu_cache_fns struct and stubbing out the new function
for other caches, so I'd like to see if anybody has any better ideas before
I go ahead and make these changes.
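
As a rough illustration of what the v7 set/way walk involves, here is a
minimal host-side sketch in C. It only computes the operands that a
set/way operation such as DCISW would be given for one cache level; it
does not touch any hardware. The names cache_geom, decode_ccsidr,
setway_operand and walk_level are hypothetical helpers for this sketch
and are not kernel functions. The CCSIDR value in the note below is
built by hand for a 32KB, 4-way cache with 32-byte lines and is not read
from a real part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cache geometry decoded from an ARMv7 CCSIDR value (hypothetical type). */
struct cache_geom {
	unsigned int line_shift;	/* log2(line length in bytes) */
	unsigned int ways;		/* associativity */
	unsigned int sets;		/* number of sets */
};

/* Callback invoked once per set/way operand (hypothetical type). */
typedef void (*setway_fn)(uint32_t operand, void *ctx);

static struct cache_geom decode_ccsidr(uint32_t ccsidr)
{
	struct cache_geom g;

	g.line_shift = (ccsidr & 0x7) + 4;	/* field holds log2(words) - 2 */
	g.ways = ((ccsidr >> 3) & 0x3ff) + 1;	/* field holds ways - 1 */
	g.sets = ((ccsidr >> 13) & 0x7fff) + 1;	/* field holds sets - 1 */
	return g;
}

/* The way index occupies the top ceil(log2(ways)) bits of the operand. */
static unsigned int way_shift(unsigned int ways)
{
	unsigned int bits = 0;

	while ((1u << bits) < ways)
		bits++;
	return 32 - bits;	/* 32 when ways == 1: no way field at all */
}

/* Build one set/way operand; level is 0 for L1, 1 for L2, and so on. */
static uint32_t setway_operand(const struct cache_geom *g, unsigned int level,
			       unsigned int way, unsigned int set)
{
	unsigned int shift = way_shift(g->ways);
	uint32_t op;

	assert(level < 7 && way < g->ways && set < g->sets);
	op = ((uint32_t)set << g->line_shift) | ((uint32_t)level << 1);
	if (shift < 32)
		op |= (uint32_t)way << shift;
	return op;
}

/*
 * Visit every set/way of one cache level and return the number of
 * operations. In kernel code the callback would issue the mcr for the
 * chosen set/way operation; here it may be NULL to just count.
 */
static unsigned long walk_level(uint32_t ccsidr, unsigned int level,
				setway_fn fn, void *ctx)
{
	struct cache_geom g = decode_ccsidr(ccsidr);
	unsigned long count = 0;
	unsigned int way, set;

	for (way = 0; way < g.ways; way++) {
		for (set = 0; set < g.sets; set++) {
			uint32_t op = setway_operand(&g, level, way, set);

			if (fn != NULL)
				fn(op, ctx);
			count++;
		}
	}
	return count;
}
```

With the hand-built value 0x001FE019, decode_ccsidr() yields 4 ways,
256 sets and 32-byte lines, so walk_level() visits 1024 set/way pairs
for that single level. A real implementation would repeat this for each
data or unified cache level below the Level of Coherency reported in
CLIDR.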

One possibility is not to turn off coherency, but if platform_do_lowpower
is more than a WFI I don't think this would be suitable.

Any thoughts?

Will

* [RFC] Fixing CPU Hotplug for RealView Platforms
@ 2010-12-20  8:16 Vincent Guittot
  2011-01-03 10:46 ` Russell King - ARM Linux
From: Vincent Guittot @ 2010-12-20  8:16 UTC
  To: linux-arm-kernel

I'm also interested in hotplug latency measurements and have done some
on my CA9 platform, the u8500. I see the same kind of results when
plugging in a secondary cpu:
  total duration = 295ms
  166us for the low-level cpu wake-up
  228ms between the return from platform_cpu_die and the cpu coming online

I have added some trace events to take these measurements, and I'd
like to add some generic trace points to the cpu hotplug code, like
those we already have in the power management code (cpuidle, suspend,
cpufreq ...). These traces could be used together with power events to
study the impact of cpu hotplug on the complete power management scheme.



> Message: 4
> Date: Sat, 18 Dec 2010 19:22:13 +0000
> From: Russell King - ARM Linux <linux@arm.linux.org.uk>
> To: Will Deacon <will.deacon@arm.com>
> Cc: linux-arm-kernel at lists.infradead.org
> Subject: Re: [RFC] Fixing CPU Hotplug for RealView Platforms
> Message-ID: <20101218192213.GL9937@n2100.arm.linux.org.uk>
> Content-Type: text/plain; charset=us-ascii
>
> On Sat, Dec 18, 2010 at 05:44:47PM +0000, Will Deacon wrote:
>> > Hotplug bringup:
>> >
>> > Booting: 1000            -> 0ns             0ns             (1us per print)
>> > Restarting: 3976375      ->                 3.976375ms
>> > cross call: 3976625      -> 3.976625ms
>> > Up: 4003125              ->                 4.003125ms
>> > CPU1: Booted secondary processor
>> > secondary_init: 4022583  ->                 4.022583ms
>> > writing release: 4040750 ->                 4.04075ms
>> > release done: 4051083    ->                 4.051083ms
>> > released: 46509000       -> 4.6509ms
>> > Boot returned: 51745708  -> 5.1745708ms
>> > sync'd: 51745875         ->                 5.1745875ms
>> > CPU1: Unknown IPI message 0x1
>> > Switched to NOHz mode on CPU #1
>> > Online: 281251041        ->                 281.251041ms
>> >
>> > So, it appears to take 4ms to get from just before the call to
>> > boot_secondary() in __cpu_up() to writing pen_release.
>> >
>> > The secondary CPU appears to run from being woken up to writing the
>> > pen release in about 40us - and then spends about 1ms spinning on
>> > its lock waiting for the requesting CPU to catch up.
>> >
>> > This can be repeated every time without exception when you bring a
>> > CPU back online.
>> >
>> Hmm, this sounds needlessly expensive.
>
> Actually, I'm starting to get concerned about doing timing measurements
> on Versatile Express - I'm seeing some unexplainable issues with the
> Versatile Express platform.
>
> I occasionally see the kernel get stuck when initializing the CLCD - and
> I think this is a hardware lockup - pressing the red 'reset/power on'
> button is ignored, and the only way to recover it is to press the
> black 'power off' button first.
>
> Also I keep running into some weird stuff which causes the MMC to
> underflow, serial output to be corrupted, and rootfs not to be mounted
> which is 100% reliable with some kernels (iow, the built kernel just
> will not boot no matter how many times you attempt to do so.)  I've
> sent Catalin & Philippe a copy of one such kernel which exhibits this
> behaviour a few days ago (but I think they're on holiday.)
>
> Anyway, I decided to implement a slightly different method of measuring
> the time taken, and the apparent long delays have gone - I suspect that
> was something to do with printk.  I'm now logging the times into an
> array, and later printing out the values.
>
> So, CPU1 boot:
>
> SMP: Start: 0
> SMP: Booting: 916
> SMP: Cross call: 3083
> SMP: Pen released: 278416
> SMP: Unlock: 279583
> SMP: Boot returned: 280333
>
> SMP: Sec: up: 238666
> SMP: Sec: enter: 264333
> SMP: Sec: pen write: 267083
> SMP: Sec: pen done: 268916
> SMP: Sec: exit: 279916
> SMP: Sec: calibrate: 328416
> SMP: Sec: online: 218380875
>
> CPU1 hotplug:
> SMP: Start: 0
> SMP: Booting: 833
> SMP: Cross call: 4250
> SMP: Pen released: 51500
> SMP: Unlock: 52667
> SMP: Boot returned: 53500
>
> SMP: Sec: restart: 4667
> SMP: Sec: up: 7167
> SMP: Sec: enter: 31000
> SMP: Sec: pen write: 39667
> SMP: Sec: pen done: 42167
> SMP: Sec: exit: 53000
> SMP: Sec: calibrate: 104583
> SMP: Sec: online: 221423333
>
> This looks far saner.
>
> Anyway, with the delay loop calibration, we're looking at a boot time of
> about 110us to the delay loop calibration, and 221ms for a secondary CPU
> using the existing code.  I don't think that will go up significantly if
> we re-vector offlined CPUs back through the reset vector.
>
>
>
> ------------------------------
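
The measurement approach described in the quoted message (record the
times into an array, print them afterwards) can be sketched in userspace
C as follows. This is an illustration of the general technique only and
is not the instrumentation used for the numbers above. The names tlog,
tlog_start, tlog_mark and tlog_dump are hypothetical, and
clock_gettime(CLOCK_MONOTONIC) stands in for whatever clock source the
kernel-side code read.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define TLOG_MAX 32	/* fixed capacity, so marking never allocates */

struct tlog_entry {
	const char *label;	/* must point to storage that outlives the log */
	uint64_t ns;		/* nanoseconds since tlog_start() */
};

struct tlog {
	struct tlog_entry e[TLOG_MAX];
	unsigned int n;
	uint64_t start;
};

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static void tlog_start(struct tlog *t)
{
	t->n = 0;
	t->start = now_ns();
}

/* Cheap on the measured path: one clock read and two stores, no I/O. */
static void tlog_mark(struct tlog *t, const char *label)
{
	if (t->n < TLOG_MAX) {
		t->e[t->n].label = label;
		t->e[t->n].ns = now_ns() - t->start;
		t->n++;
	}
	/* Entries beyond TLOG_MAX are silently dropped. */
}

/* All printing happens here, after the timed section has finished. */
static void tlog_dump(const struct tlog *t, FILE *out)
{
	unsigned int i;

	assert(t != NULL && out != NULL);
	for (i = 0; i < t->n; i++)
		fprintf(out, "SMP: %s: %llu\n", t->e[i].label,
			(unsigned long long)t->e[i].ns);
}
```

Typical use is tlog_start() before the section of interest, tlog_mark()
at each point of interest, and a single tlog_dump(&log, stdout) once the
timed path has completed, so the cost of formatting and output never
lands between two marks.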


Thread overview: 14+ messages
2010-12-07 16:43 [RFC] Fixing CPU Hotplug for RealView Platforms Will Deacon
2010-12-07 17:18 ` Russell King - ARM Linux
2010-12-07 17:47   ` Will Deacon
2010-12-08  6:03     ` Santosh Shilimkar
2010-12-08 13:20       ` Will Deacon
2010-12-08 20:20     ` Russell King - ARM Linux
2010-12-18 17:10     ` Russell King - ARM Linux
2010-12-18 17:44       ` Will Deacon
2010-12-18 19:22         ` Russell King - ARM Linux
  -- strict thread matches above, loose matches on Subject: below --
2010-12-20  8:16 Vincent Guittot
2011-01-03 10:46 ` Russell King - ARM Linux
2011-01-03 17:39   ` Vincent Guittot
2011-01-03 18:03     ` Russell King - ARM Linux
2011-01-04  8:55       ` Vincent Guittot
