* cpu_suspend does not flush the L2 cache
From: Scott Williams @ 2011-07-25 18:49 UTC
To: linux-arm-kernel

In 2.6.39, CPU suspend/resume crashes if an outer cache controller
(like a PL310) is configured and enabled. cpu_suspend only flushes
the L1 cache. If an outer cache controller is enabled, the context and
saved stack pointer are left sitting in the L2 memory. An attempt to
resume a secondary CPU without shutting down the entire CPU complex and
flushing the entire L2 cache will cause the secondary CPU to crash
because the stack pointer in sleep_save_sp is invalid in L3 memory.

Scott Williams
Sr. Software Engineer
NVIDIA Corporation

-- nvpublic
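The failure Scott describes can be illustrated with a toy userspace model. This is only a sketch, not kernel code: the `hierarchy` struct, the helper names, and the `SLEEP_SAVE_SP` slot are all hypothetical, and it assumes (as the report implies) that a resuming CPU reads `sleep_save_sp` straight from external memory, bypassing both cache levels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORDS 8           /* toy memory size, one cache "line" per word */
#define SLEEP_SAVE_SP 3   /* hypothetical slot standing in for sleep_save_sp */

/* Toy two-level write-back hierarchy in front of external memory. */
struct hierarchy {
    uint32_t l1[WORDS], l2[WORDS], dram[WORDS];
    bool l1_dirty[WORDS], l2_dirty[WORDS];
};

/* Cached store: the new value lands in L1 only and is marked dirty. */
static void cached_write(struct hierarchy *h, unsigned idx, uint32_t val)
{
    assert(idx < WORDS);
    h->l1[idx] = val;
    h->l1_dirty[idx] = true;
}

/* Clean L1: dirty lines move one level out, into L2, not into DRAM. */
static void clean_l1(struct hierarchy *h)
{
    for (unsigned i = 0; i < WORDS; i++) {
        if (h->l1_dirty[i]) {
            h->l2[i] = h->l1[i];
            h->l2_dirty[i] = true;
            h->l1_dirty[i] = false;
        }
    }
}

/* Clean L2: dirty lines finally reach external memory. */
static void clean_l2(struct hierarchy *h)
{
    for (unsigned i = 0; i < WORDS; i++) {
        if (h->l2_dirty[i]) {
            h->dram[i] = h->l2[i];
            h->l2_dirty[i] = false;
        }
    }
}

/* Uncached load: modelled as reading external memory directly. */
static uint32_t uncached_read(const struct hierarchy *h, unsigned idx)
{
    assert(idx < WORDS);
    return h->dram[idx];
}

/* Save a stack pointer, clean the caches, then read it back uncached. */
static uint32_t resume_sp(bool also_clean_l2)
{
    struct hierarchy h = {0};

    cached_write(&h, SLEEP_SAVE_SP, 0xc0ffee00u);
    clean_l1(&h);
    if (also_clean_l2)
        clean_l2(&h);
    return uncached_read(&h, SLEEP_SAVE_SP);
}
```

In this model `resume_sp(false)` returns the stale zero that was in external memory, while `resume_sp(true)` returns the saved value, which is the difference between an L1-only flush and one that also cleans the outer cache.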
* cpu_suspend does not flush the L2 cache
From: Russell King - ARM Linux @ 2011-07-25 20:08 UTC
To: linux-arm-kernel

On Mon, Jul 25, 2011 at 11:49:43AM -0700, Scott Williams wrote:
> In 2.6.39, CPU suspend/resume crashes if an outer cache controller
> (like a PL310) is configured and enabled. cpu_suspend only flushes
> the L1 cache.

Correct. cpu_suspend has been a _consolidation_ effort across the various
implementations. Only one implementation deals with the L2 cache issues
at present.

A bunch of patches have gone in during this merge window to continue
that consolidation effort and improve the cpu_suspend interfaces.
Eventually the L2 cache issues will be dealt with in core code.

So at the moment, platforms are expected to deal with this in their own
suspend finisher code.

FYI, I have no platforms at present that have an L2 cache and are capable
of suspend. I'm still waiting on TI for some prototype code for OMAP4
suspend support... until that time, I am unable to progress it further
unless I try to address these issues blind.
* cpu_suspend does not flush the L2 cache
From: Will Deacon @ 2011-07-25 21:31 UTC
To: linux-arm-kernel

On Mon, Jul 25, 2011 at 09:08:20PM +0100, Russell King - ARM Linux wrote:
> On Mon, Jul 25, 2011 at 11:49:43AM -0700, Scott Williams wrote:
> > In 2.6.39, CPU suspend/resume crashes if an outer cache controller
> > (like a PL310) is configured and enabled. cpu_suspend only flushes
> > the L1 cache.
>
> Correct. cpu_suspend has been a _consolidation_ effort across the various
> implementations. Only one implementation deals with the L2 cache issues
> at present.
>
> A bunch of patches have gone in during this merge window to continue
> that consolidation effort and improve the cpu_suspend interfaces.
> Eventually the L2 cache issues will be dealt with in core code.

I seem to have the outer L2 stuff working pretty well in my latest kexec
series (my vexpress happily ran init=/kexec.sh all weekend). That should
hopefully provide all the necessary hooks in the core code. If not, we
should work something out.

I'll post what I have after the merge window (SMP still not quite there
yet).

Will
* cpu_suspend does not flush the L2 cache
From: Barry Song @ 2011-07-28 8:15 UTC
To: linux-arm-kernel

2011/7/26 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> On Mon, Jul 25, 2011 at 11:49:43AM -0700, Scott Williams wrote:
>> In 2.6.39, CPU suspend/resume crashes if an outer cache controller
>> (like a PL310) is configured and enabled. cpu_suspend only flushes
>> the L1 cache.
>
> Correct. cpu_suspend has been a _consolidation_ effort across the various
> implementations. Only one implementation deals with the L2 cache issues
> at present.
>
> A bunch of patches have gone in during this merge window to continue
> that consolidation effort and improve the cpu_suspend interfaces.
> Eventually the L2 cache issues will be dealt with in core code.
>
> So at the moment, platforms are expected to deal with this in their own
> suspend finisher code.

So one possible way is that platforms clean and flush the L2 cache while
suspending, then disable L2. After resuming from the wake-up entry,
platforms reinitialize L2 by some hardware setting and l2x_init.

> FYI, I have no platforms at present that have an L2 cache and are capable
> of suspend. I'm still waiting on TI for some prototype code for OMAP4
> suspend support... until that time, I am unable to progress it further
> unless I try to address these issues blind.

On SiRFprimaII, we have tried suspend/resume when L2 is on. I'd like to
give a platform example. Finally, L2 cache suspend/resume can be in core
code.

-barry
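The ordering Barry proposes (flush L2, disable it, then re-initialise it on resume) can be sketched against a toy model of the outer cache. Everything here is an assumption for illustration: `struct l2_model` and the `l2_*`/`platform_*` helpers are hypothetical stand-ins, not the kernel's outer cache API, and the model simply assumes that re-initialisation invalidates the cache before enabling it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy outer-cache state; a real controller is driven through registers. */
struct l2_model {
    bool enabled;
    unsigned dirty_lines;   /* lines newer than external memory */
    unsigned valid_lines;   /* lines holding any content at all */
};

/* Stand-in for a clean+invalidate of the whole outer cache. */
static void l2_flush_all(struct l2_model *l2)
{
    l2->dirty_lines = 0;    /* dirty data has been written back */
    l2->valid_lines = 0;    /* nothing stale is left behind */
}

static void l2_disable(struct l2_model *l2)
{
    l2->enabled = false;
}

/* Stand-in for platform re-init: invalidate first, then enable. */
static void l2_reinit(struct l2_model *l2)
{
    assert(!l2->enabled);   /* must not invalidate a live cache */
    l2->valid_lines = 0;
    l2->dirty_lines = 0;
    l2->enabled = true;
}

/* Suspend side: returns true when L2 power can be cut without data loss. */
static bool platform_suspend_l2(struct l2_model *l2)
{
    l2_flush_all(l2);
    l2_disable(l2);
    return l2->dirty_lines == 0 && !l2->enabled;
}

/* Resume side: whatever the RAM holds after power loss is discarded. */
static void platform_resume_l2(struct l2_model *l2)
{
    l2_reinit(l2);
}
```

The model only covers the full-suspend case where L2 loses power completely; it says nothing about states where L2 contents are retained.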
* cpu_suspend does not flush the L2 cache
From: Santosh Shilimkar @ 2011-07-28 9:57 UTC
To: linux-arm-kernel

On 7/28/2011 1:45 PM, Barry Song wrote:
> 2011/7/26 Russell King - ARM Linux <linux@arm.linux.org.uk>:
>> On Mon, Jul 25, 2011 at 11:49:43AM -0700, Scott Williams wrote:
>>> In 2.6.39, CPU suspend/resume crashes if an outer cache controller
>>> (like a PL310) is configured and enabled. cpu_suspend only flushes
>>> the L1 cache.
>>
>> Correct. cpu_suspend has been a _consolidation_ effort across the various
>> implementations. Only one implementation deals with the L2 cache issues
>> at present.
>>
>> A bunch of patches have gone in during this merge window to continue
>> that consolidation effort and improve the cpu_suspend interfaces.
>> Eventually the L2 cache issues will be dealt with in core code.
>>
>> So at the moment, platforms are expected to deal with this in their own
>> suspend finisher code.
>
> So one possible way is that platforms clean and flush the L2 cache while
> suspending, then disable L2.
> After resuming from the wake-up entry, platforms reinitialize L2 by some
> hardware setting and l2x_init.

Flushing is not going to address other scenarios with L2. There are
issues even when only the CPU loses its context: while re-enabling the
MMU on it in the power-up sequence, L2 creates an issue.

>> FYI, I have no platforms at present that have an L2 cache and are capable
>> of suspend. I'm still waiting on TI for some prototype code for OMAP4
>> suspend support... until that time, I am unable to progress it further
>> unless I try to address these issues blind.

Hopefully we can sort out this issue, considering Russell has the
OMAP4 PM code to experiment with now.

> On SiRFprimaII, we have tried suspend/resume when L2 is on. I'd
> like to give a platform example.
> Finally, L2 cache suspend/resume can be in core code.

Flushing L2 isn't a solution for the case where L2 memory is
retained but logic is lost. You might use such states in
CPUIDLE.
For suspend, though, this will work because you always try
to go to the deepest possible low-power state.

Regards
Santosh
* cpu_suspend does not flush the L2 cache
From: Scott Williams @ 2011-07-28 17:14 UTC
To: linux-arm-kernel

In the CPU idle case where only one CPU is shutting down, disabling the
L2 cache is not an option. I've done experiments cleaning only the lines
containing the CPU context (failed after < 100 cpu_suspend cycles) and
cleaning the entire L2 cache (failed after ~36K cycles) with additional
flushes of the L1 data cache before exiting coherency. The system
eventually panics because of an invalid PMD. Initial analysis points to
spin lock failure. The only solution I've found so far is to disable the
L2 cache entirely (has so far survived >120K cycles).

-- nvpublic
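For the first of Scott's experiments, cleaning only the lines that hold the CPU context, the address range has to be widened to whole cache lines before it is handed to a range-clean operation. A minimal sketch of that arithmetic follows, assuming the PL310's 32-byte line size and 32-bit physical addresses; `struct line_range`, `context_lines` and `line_count` are hypothetical names, and the sketch says nothing about why the approach failed in his testing.

```c
#include <assert.h>
#include <stdint.h>

#define L2_LINE_SIZE 32u   /* PL310 cache line size in bytes */

struct line_range {
    uint32_t start;   /* first byte of the first line, inclusive */
    uint32_t end;     /* first byte past the last line, exclusive */
};

/* Expand [addr, addr + size) to whole cache lines; size must be non-zero. */
static struct line_range context_lines(uint32_t addr, uint32_t size)
{
    struct line_range r;

    assert(size > 0);
    /* Round the start down and the end up to a line boundary. */
    r.start = addr & ~(L2_LINE_SIZE - 1);
    r.end = (addr + size + L2_LINE_SIZE - 1) & ~(L2_LINE_SIZE - 1);
    return r;
}

/* Number of cache lines a range-clean would have to write back. */
static uint32_t line_count(struct line_range r)
{
    return (r.end - r.start) / L2_LINE_SIZE;
}
```

A 64-byte context that starts 4 bytes into a line straddles three lines rather than two, which is why the rounding matters.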
* cpu_suspend does not flush the L2 cache
From: Lorenzo Pieralisi @ 2011-07-28 18:10 UTC
To: linux-arm-kernel

On Thu, Jul 28, 2011 at 06:14:00PM +0100, Scott Williams wrote:
> In the CPU idle case where only one CPU is shutting down, disabling the
> L2 cache is not an option. I've done experiments cleaning only the lines
> containing the CPU context (failed after < 100 cpu_suspend cycles) and
> cleaning the entire L2 cache (failed after ~36K cycles) with additional
> flushes of the L1 data cache before exiting coherency. The system
> eventually panics because of an invalid PMD. Initial analysis points to
> spin lock failure. The only solution I've found so far is to disable the
> L2 cache entirely (has so far survived >120K cycles).

Scott, in the cpu idle, single cpu shutdown case the procedure to follow
is this one:

- disable d-cache in SCTLR
- clean/invalidate d-cache

the above in a single function to avoid pulling cache lines from other
CPU(s) (e.g. stack, thread_info).

- exit coherency

At this point in time cacheable spinlocks are not usable anymore.

If you do use cpu_suspend, you still have to flush the stack to L3. You
should do that with functions which do not use cacheable spinlocks; so
basically that code becomes racy. Or you use outer cache functions and
the system stops working, since that code takes spinlocks on cacheable
memory and they are gone.

I am using non-cacheable memory to save the context and the procedure
above works perfectly fine; I do not have to clean lines from L2. But I
am abusing the current cpu_suspend implementation, since I reverted to
using cpu_do_suspend, which is not a kernel API; I agree with Russell.

On CPU wake up, when the MMU is off, code should not write any data that
might be in L2. I am using a temporary non-cacheable stack for that,
before the MMU is enabled.
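Lorenzo's three-step ordering can be sketched on a toy CPU model to show why the D-cache is disabled before it is cleaned. This is an illustration under stated assumptions, not the real power-down path: `struct cpu_model` and the helpers are hypothetical, real code would manipulate SCTLR and ACTLR through CP15 accesses in assembly, and the ACTLR.SMP bit position used here is the Cortex-A9 one.

```c
#include <assert.h>
#include <stdint.h>

#define SCTLR_C   (1u << 2)   /* SCTLR.C: data cache enable */
#define ACTLR_SMP (1u << 6)   /* ACTLR.SMP on Cortex-A9: takes part in coherency */

/* Toy CPU: two control registers plus a count of lines held in L1. */
struct cpu_model {
    uint32_t sctlr;
    uint32_t actlr;
    unsigned l1_lines;
};

/* Any memory access allocates an L1 line, but only while SCTLR.C is set. */
static void touch_memory(struct cpu_model *cpu)
{
    if (cpu->sctlr & SCTLR_C)
        cpu->l1_lines++;
}

/* Stand-in for a clean and invalidate of the whole L1 data cache. */
static void clean_inv_dcache(struct cpu_model *cpu)
{
    cpu->l1_lines = 0;
}

/* Order from the message: disable, clean/invalidate, exit coherency. */
static void power_down_in_order(struct cpu_model *cpu)
{
    cpu->sctlr &= ~SCTLR_C;
    clean_inv_dcache(cpu);
    touch_memory(cpu);          /* e.g. a stack access: allocates nothing */
    cpu->actlr &= ~ACTLR_SMP;
}

/* Contrast: cleaning before disabling lets a later access refill L1. */
static void power_down_wrong_order(struct cpu_model *cpu)
{
    clean_inv_dcache(cpu);
    touch_memory(cpu);          /* cache still on: a line comes back */
    cpu->sctlr &= ~SCTLR_C;
    cpu->actlr &= ~ACTLR_SMP;
}
```

With the cache disabled first, the stray access after the clean leaves L1 empty; in the wrong order one line is back in L1 when the CPU leaves coherency, which is the kind of leftover the single-function requirement is meant to rule out.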
Thread overview: 7+ messages
2011-07-25 18:49 cpu_suspend does not flush the L2 cache Scott Williams
2011-07-25 20:08 ` Russell King - ARM Linux
2011-07-25 21:31   ` Will Deacon
2011-07-28  8:15   ` Barry Song
2011-07-28  9:57     ` Santosh Shilimkar
2011-07-28 17:14       ` Scott Williams
2011-07-28 18:10         ` Lorenzo Pieralisi