* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
@ 2012-05-14 7:03 Murali N
2012-05-14 15:50 ` Lorenzo Pieralisi
0 siblings, 1 reply; 26+ messages in thread
From: Murali N @ 2012-05-14 7:03 UTC (permalink / raw)
To: linux-arm-kernel
Hi All,
I have a query on the cache flush sequence to be followed for L1 & L2
while the target is going into a deep low power state on CortexA5 MPCore.
Here are the H/W details & the cache flush sequence I am following in
my power driver:
H/W details:
1. APPS processor: CortexA5 MPCore
2. L2 controller: External PL310 r3p2
Sequences:
a) While the target is going into deep low power mode (where the APPS
processor + L2 lose their power) currently I am following the below
cache flush sequence.
1. L2 cache clean & invalidate
2. L2 disable
3. L1 clean & invalidate
4. L1 disable
5. WFI
b) But when I look at the PL310 r3p2 TRM (page no 91), the sequence it
explains is a bit different from what I am following.
1. L1 clean & invalidate
2. L1 disable
3. L2 cache clean & invalidate
4. L2 disable
5. WFI
Is it mandatory that I follow only the sequence that is
mentioned in the TRM (i.e. b)? Or, although the TRM gives the above
sequence (i.e. b), can I still follow the steps in (a)?
What problems would I see if I don't follow what the TRM says & follow
the sequence which I have mentioned above (i.e. a)?
Also I have worked on another target with CortexA5 (single core with
the same L2 PL310 controller) where I have followed the sequence 'a' for
quite a long time and don't see any data corruption issues.
Here my question is, is the above sequence 'b' something special for
only CortexA5 MPCore targets to follow?
From a system stability point of view I don't see any improvement after I
moved to the sequence mentioned in the TRM (i.e. b) for the CortexA5 MPCore
target.
Please provide your valuable inputs if you guys have seen similar
issues on other targets.
--
Regards,
Murali N
^ permalink raw reply [flat|nested] 26+ messages in thread
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-14 7:03 L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes Murali N
@ 2012-05-14 15:50 ` Lorenzo Pieralisi
2012-05-14 15:58 ` Russell King - ARM Linux
0 siblings, 1 reply; 26+ messages in thread
From: Lorenzo Pieralisi @ 2012-05-14 15:50 UTC (permalink / raw)
To: linux-arm-kernel

Hi,

On Mon, May 14, 2012 at 08:03:04AM +0100, Murali N wrote:
> Hi All,
> I have a query on cache flush sequence being followed for L1 & L2
> while target going into deep low power state on CortexA5 MPCore.
> Here are the H/W details & the cache flush sequence i am following in
> my power driver:
>
> H/W details:
> 1. APPS processor: CortexA5 MPCore
> 2. L2 controller: External PL310 r3p2
>
> Sequences:
> a) While target is going into deep low power mode (where APPS
> processor + L2 loose their power) currently I am following the below
> cache flush sequence.
>
> 1. L2 cache clean & invalidate

This is wrong. If L1 evictions happen here you will kiss those cache lines
goodbye when the cluster is powered off. See below.

> 2. L2 disable
> 3. L1 clean & invalidate

This is wrong again since while cleaning and invalidating the cache (L1 here)
can still allocate and this must not happen.

> 4. L1 disable
> 5. WFI
>
> b) But when I look the PL310 r3p2 TRM (page no 91) explains the
> sequence to be followed is bit difference than what I am following.
>
> 1. L1 clean & invalidate
> 2. L1 disable
> 3. L2 cache clean & invalidate
> 4. L2 disable
> 5. WFI

You are *extrapolating* the procedure above from the TRM, but that's not 100%
correct.

For a single CPU shutdown the procedure is the following:

1) clear C bit in SCTLR. CPU won't allocate cache lines in integrated
   (L1 for A5) caches anymore. Memory access might still hit in the cache,
   but that's not a problem, you just want to writeback the content of
   caches to DDR on power down.

   This is subtle but important. If a dirty cache line is moved from one
   processor to the one going down while cleaning the cache, the cache line
   is lost (dirty lines can be moved between processors). Clearing the C bit
   BEFORE starting the cache clean prevents that.
2) clean and invalidate the cache levels (L1 in A5)
3) exit coherency (clear SMP bit in ACTLR)

If the cluster has to be shut down as well and L2 is not retained through
power down:

4) clean and invalidate L2
5) disable PL310

Please note that 5 might not be strictly required, it depends on your specific
HW configuration and how AXI transactions interact with the power controller.
If you want to be on the safe side, (5) has to be executed.

Please note that PL310 can be disabled before cleaning and invalidating L2.
If you carry out the operations in the order above, code must NOT write any
static data that has to be preserved throughout shutdown between (4) and (5).
The C bit in SCTLR does not affect PL310 since it is external to the core so
you could end up allocating cache lines after the entire content of L2 has
been cleaned. If those lines are just eg stack lines that can be discarded
then fine, but if that data is to be preserved and consistent through
shut down then you have been warned.

I suggest you have a look at OMAP4 CPU idle implementation where the above is
implemented in detail, inclusive of cpu_{suspend}/cpu_{resume} API that
provides the infrastructure on top of which cache management code must be
built.

> Is it mandatory that I would follow only the sequence that is
> mentioned in the TRM (i.e. b)? (OR) though TRM says above sequence
> (i.e. b) can i still follow the steps (i.e. a)?
> What are problems that I see, if I don't follow what TRM says & follow
> the sequence which I have mentioned above (i.e. a)?

Yes, it is mandatory. I hope I explained why thoroughly. And (b), as it stands
in your description is wrong and I explained why it is so.

> Also I have worked on another target with CortexA5 (Single core with
> same L2 pl310 controller) where i have followed the sequence 'a' for
> quite a long time and don't see any data corruption issues.

This does not mean the procedure is correct.

> Here my question is, is the above sequence 'b' something special for
> only CortexA5MPCore targets to follow?

(b) is wrong, and the "patched" procedure I provided you with works for all
ARM MP systems (and consequently UP as well).

Hope that helps, feel free to come back to us for any questions.

Lorenzo

^ permalink raw reply [flat|nested] 26+ messages in thread
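As a rough illustration of the ordering argument in the reply above, here is a
toy Python model. It is not kernel code, and the class and function names are
invented for the example. It tracks only dirty lines in an L1 and an L2 in
front of RAM, and it assumes that disabling the L2 writes nothing back and
that an L1 eviction lands in the L2 whenever the L2 is enabled. Under those
assumptions, an L1 eviction that happens after the L2 clean in sequence (a)
is lost at power-off, while cleaning L1 first and L2 second preserves it.

```python
class ToyHierarchy:
    """Toy write-back L1 + L2 in front of RAM; only dirty lines are tracked."""

    def __init__(self):
        self.ram = {}
        self.l1 = {}            # addr -> dirty value held in L1
        self.l2 = {}            # addr -> dirty value held in L2
        self.l2_enabled = True

    def cpu_write(self, addr, value):
        self.l1[addr] = value   # a write allocates a dirty L1 line

    def l1_evict(self, addr):
        value = self.l1.pop(addr)
        if self.l2_enabled:
            self.l2[addr] = value   # dirty line lands in L2, not in RAM
        else:
            self.ram[addr] = value  # L2 bypassed: goes straight to RAM

    def l1_clean_invalidate(self):
        for addr in list(self.l1):
            self.l1_evict(addr)

    def l2_clean_invalidate(self):
        self.ram.update(self.l2)
        self.l2.clear()

    def l2_disable(self):
        self.l2_enabled = False     # assumption: disabling does not clean

    def power_off(self):
        self.l1.clear()             # cache contents are lost with power
        self.l2.clear()


def sequence_a_with_eviction():
    """Sequence (a), with an L1 eviction after the L2 clean has finished."""
    h = ToyHierarchy()
    h.cpu_write(0x1000, "precious")
    h.l2_clean_invalidate()     # 1. L2 clean & invalidate
    h.l1_evict(0x1000)          # eviction re-dirties L2 behind our back
    h.l2_disable()              # 2. L2 disable
    h.l1_clean_invalidate()     # 3. L1 clean & invalidate (nothing left)
    h.power_off()               # 5. power is removed
    return h.ram.get(0x1000)


def l1_then_l2():
    """Inner cache first, then outer cache, then disable and power off."""
    h = ToyHierarchy()
    h.cpu_write(0x1000, "precious")
    h.l1_clean_invalidate()
    h.l2_clean_invalidate()
    h.l2_disable()
    h.power_off()
    return h.ram.get(0x1000)
```

In this model `sequence_a_with_eviction()` returns `None` (the data never
reached RAM) and `l1_then_l2()` returns `"precious"`. The model ignores the
SCTLR.C and ACTLR.SMP steps entirely; it only shows why the outer cache has
to be cleaned last.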
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-14 15:50 ` Lorenzo Pieralisi
@ 2012-05-14 15:58 ` Russell King - ARM Linux
2012-05-14 16:21 ` Lorenzo Pieralisi
0 siblings, 1 reply; 26+ messages in thread
From: Russell King - ARM Linux @ 2012-05-14 15:58 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, May 14, 2012 at 04:50:22PM +0100, Lorenzo Pieralisi wrote:
> > 2. L2 disable
> > 3. L1 clean & invalidate
>
> This is wrong again since while cleaning and invalidating the cache (L1 here)
> can still allocate and this must not happen.

No it isn't. There is never anything wrong with allocating new caches lines
into a cache which is going to (eventually) be powered down. Ever.

What would be wrong is if we end up with dirty cache lines in the cache
to be powered down for data which we _care_ about preserving when power
is lost.

That's a _very_ _very_ important difference.

Sure, if we're talking about avoiding cache snooping etc, then we may
wish to disable coherency, but, again, there's absolutely nothing wrong
with allocating cache lines.

Take a moment to think why this is. Where's the data pulled into the
cache stored - in RAM. The copy in the cache, while it remains clean,
is just a duplicate of what's already stored elsewhere in the system.

^ permalink raw reply [flat|nested] 26+ messages in thread
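The clean-versus-dirty distinction made in the message above can be put into a
few lines of Python. This is only a sketch: the `Line` type and the
`data_lost_at_power_off` helper are hypothetical names made up for the
illustration, and a "cache" here is just a dictionary from address to line.

```python
from dataclasses import dataclass


@dataclass
class Line:
    value: int
    dirty: bool


def data_lost_at_power_off(cache, ram):
    """Return the addresses whose latest value exists only in the cache."""
    return sorted(
        addr
        for addr, line in cache.items()
        if line.dirty and ram.get(addr) != line.value
    )


ram = {0x100: 1, 0x200: 2}
cache = {
    0x100: Line(value=1, dirty=False),  # clean: a duplicate of what RAM holds
    0x200: Line(value=7, dirty=True),   # dirty: RAM still holds the old value
}
lost = data_lost_at_power_off(cache, ram)
```

Here `lost` contains only `0x200`. The clean line at `0x100` can be dropped
with the power rail at no cost, which is the point being argued: allocation as
such is harmless, and only dirty lines holding data that must survive are a
problem.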
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-14 15:58 ` Russell King - ARM Linux
@ 2012-05-14 16:21 ` Lorenzo Pieralisi
2012-05-14 16:39 ` Russell King - ARM Linux
2013-12-24 17:52 ` Antti Miettinen
0 siblings, 2 replies; 26+ messages in thread
From: Lorenzo Pieralisi @ 2012-05-14 16:21 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, May 14, 2012 at 04:58:59PM +0100, Russell King - ARM Linux wrote:
> On Mon, May 14, 2012 at 04:50:22PM +0100, Lorenzo Pieralisi wrote:
> > > 2. L2 disable
> > > 3. L1 clean & invalidate
> >
> > This is wrong again since while cleaning and invalidating the cache (L1 here)
> > can still allocate and this must not happen.
>
> No it isn't. There is never anything wrong with allocating new caches lines
> into a cache which is going to (eventually) be powered down. Ever.

What if the cache allocates a dirty cache line moved from L1 of another
processor ?

> What would be wrong is if we end up with dirty cache lines in the cache
> to be powered down for data which we _care_ about preserving when power
> is lost.
>
> That's a _very_ _very_ important difference.

That's exactly the point I am making. dirty cache lines can be migrated across
processors caches. If we want to shut down a single core we have to be 100%
sure that dirty cache lines (if we care about that data, we might be not as you
pointed out) must not be present in L1 when we shut the core down. If the C
bit in SCTLR is not cleared before cleaning and invalidating this is not
guaranteed from an architectural point of view.

Occurences might be rare, but it is still not safe to clean the cache with the
C bit set.

> Sure, if we're talking about avoiding cache snooping etc, then we may
> wish to disable coherency, but, again, there's absolutely nothing wrong
> with allocating cache lines.
>
> Take a moment to think why this is. Where's the data pulled into the
> cache stored - in RAM. The copy in the cache, while it remains clean,
> is just a duplicate of what's already stored elsewhere in the system.

It can be stored in other caches RAM too on MP systems. While it remains
clean, fine. It is dirty cache lines migration I am talking about.

Lorenzo

^ permalink raw reply [flat|nested] 26+ messages in thread
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-14 16:21 ` Lorenzo Pieralisi
@ 2012-05-14 16:39 ` Russell King - ARM Linux
2012-05-14 17:15 ` Lorenzo Pieralisi
2013-12-24 17:52 ` Antti Miettinen
1 sibling, 1 reply; 26+ messages in thread
From: Russell King - ARM Linux @ 2012-05-14 16:39 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, May 14, 2012 at 05:21:50PM +0100, Lorenzo Pieralisi wrote:
> On Mon, May 14, 2012 at 04:58:59PM +0100, Russell King - ARM Linux wrote:
> > On Mon, May 14, 2012 at 04:50:22PM +0100, Lorenzo Pieralisi wrote:
> > > > 2. L2 disable
> > > > 3. L1 clean & invalidate
> > >
> > > This is wrong again since while cleaning and invalidating the cache (L1 here)
> > > can still allocate and this must not happen.
> >
> > No it isn't. There is never anything wrong with allocating new caches lines
> > into a cache which is going to (eventually) be powered down. Ever.
>
> What if the cache allocates a dirty cache line moved from L1 of another
> processor ?
>
> > What would be wrong is if we end up with dirty cache lines in the cache
> > to be powered down for data which we _care_ about preserving when power
> > is lost.
> >
> > That's a _very_ _very_ important difference.
>
> That's exactly the point I am making. dirty cache lines can be migrated across
> processors caches. If we want to shut down a single core we have to be 100%
> sure that dirty cache lines (if we care about that data, we might be not as you
> pointed out) must not be present in L1 when we shut the core down. If the C
> bit in SCTLR is not cleared before cleaning and invalidating this is not
> guaranteed from an architectural point of view.
>
> Occurences might be rare, but it is still not safe to clean the cache with the
> C bit set.

It's not safe to disable the C bit without first pushing the dirty data out
to RAM either. It's a catch-22 situation - because turning the C bit off
not only stops the caches allocating new lines but also prevents them being
searched.

That means your view of cacheable memory suddenly changes beneath you when
the C bit is turned off.

From what you're saying - and from my understanding of your cache behaviours,
even the sequence:
- clean cache
- disable C bit
- clean cache
is buggy.

I think what you're effectively saying is that it is not possible to safely
power down a cache on an ARM SMP CPU...

^ permalink raw reply [flat|nested] 26+ messages in thread
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-14 16:39 ` Russell King - ARM Linux
@ 2012-05-14 17:15 ` Lorenzo Pieralisi
2012-05-15 9:25 ` Murali N
2012-05-15 9:40 ` Russell King - ARM Linux
0 siblings, 2 replies; 26+ messages in thread
From: Lorenzo Pieralisi @ 2012-05-14 17:15 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, May 14, 2012 at 05:39:09PM +0100, Russell King - ARM Linux wrote:
> On Mon, May 14, 2012 at 05:21:50PM +0100, Lorenzo Pieralisi wrote:
> > On Mon, May 14, 2012 at 04:58:59PM +0100, Russell King - ARM Linux wrote:
> > > On Mon, May 14, 2012 at 04:50:22PM +0100, Lorenzo Pieralisi wrote:
> > > > > 2. L2 disable
> > > > > 3. L1 clean & invalidate
> > > >
> > > > This is wrong again since while cleaning and invalidating the cache (L1 here)
> > > > can still allocate and this must not happen.
> > >
> > > No it isn't. There is never anything wrong with allocating new caches lines
> > > into a cache which is going to (eventually) be powered down. Ever.
> >
> > What if the cache allocates a dirty cache line moved from L1 of another
> > processor ?
> >
> > > What would be wrong is if we end up with dirty cache lines in the cache
> > > to be powered down for data which we _care_ about preserving when power
> > > is lost.
> > >
> > > That's a _very_ _very_ important difference.
> >
> > That's exactly the point I am making. dirty cache lines can be migrated across
> > processors caches. If we want to shut down a single core we have to be 100%
> > sure that dirty cache lines (if we care about that data, we might be not as you
> > pointed out) must not be present in L1 when we shut the core down. If the C
> > bit in SCTLR is not cleared before cleaning and invalidating this is not
> > guaranteed from an architectural point of view.
> >
> > Occurences might be rare, but it is still not safe to clean the cache with the
> > C bit set.
>
> It's not safe to disable the C bit without first pushing the dirty data out
> to RAM either. It's a catch-22 situation - because turning the C bit off
> not only stops the caches allocating new lines but also prevents them being
> searched.

That depends on the processor. On A9 cache is bypassed on A15 it is not,
data access might still hit in the cache.

It is "implementation defined" according to ARM ARM (B2-1265).
But C bit cleared stops allocation that's true across all implementations.

> That means your view of cacheable memory suddenly changes beneath you when
> the C bit is turned off.

Yes might be (see above) but the cache operations still work so we do
not have any problem (well, as long as we clean and invalidate without
using data that can live in the cache, but that's how it is done on v7 cache
flush ops and it is perfectly fine).

> From what you're saying - and from my understanding of your cache behaviours,
> even the sequence:
> - clean cache
> - disable C bit
> - clean cache
> is buggy.

No, that's correct, works fine on A9 and A15. Second clean is mostly nops.

>
> I think what you're effectively saying is that it is not possible to safely
> power down a cache on an ARM SMP CPU...

It is possible, but the final clean must be done with C bit cleared. It is
belt and braces, agreed, but that's the only way to do it properly.

Lorenzo

^ permalink raw reply [flat|nested] 26+ messages in thread
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-14 17:15 ` Lorenzo Pieralisi
@ 2012-05-15 9:25 ` Murali N
2012-05-15 9:40 ` Russell King - ARM Linux
1 sibling, 0 replies; 26+ messages in thread
From: Murali N @ 2012-05-15 9:25 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, May 14, 2012 at 10:45 PM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On Mon, May 14, 2012 at 05:39:09PM +0100, Russell King - ARM Linux wrote:
>> On Mon, May 14, 2012 at 05:21:50PM +0100, Lorenzo Pieralisi wrote:
>> > On Mon, May 14, 2012 at 04:58:59PM +0100, Russell King - ARM Linux wrote:
>> > > On Mon, May 14, 2012 at 04:50:22PM +0100, Lorenzo Pieralisi wrote:
>> > > > > 2. L2 disable
>> > > > > 3. L1 clean & invalidate
>> > > >
>> > > > This is wrong again since while cleaning and invalidating the cache (L1 here)
>> > > > can still allocate and this must not happen.
>> > >
>> > > No it isn't. There is never anything wrong with allocating new caches lines
>> > > into a cache which is going to (eventually) be powered down. Ever.
>> >
>> > What if the cache allocates a dirty cache line moved from L1 of another
>> > processor ?
>> >
>> > > What would be wrong is if we end up with dirty cache lines in the cache
>> > > to be powered down for data which we _care_ about preserving when power
>> > > is lost.
>> > >
>> > > That's a _very_ _very_ important difference.
>> >
>> > That's exactly the point I am making. dirty cache lines can be migrated across
>> > processors caches. If we want to shut down a single core we have to be 100%
>> > sure that dirty cache lines (if we care about that data, we might be not as you
>> > pointed out) must not be present in L1 when we shut the core down. If the C
>> > bit in SCTLR is not cleared before cleaning and invalidating this is not
>> > guaranteed from an architectural point of view.
>> >
>> > Occurences might be rare, but it is still not safe to clean the cache with the
>> > C bit set.
>>
>> It's not safe to disable the C bit without first pushing the dirty data out
>> to RAM either. It's a catch-22 situation - because turning the C bit off
>> not only stops the caches allocating new lines but also prevents them being
>> searched.
>
> That depends on the processor. On A9 cache is bypassed on A15 it is not,
> data access might still hit in the cache.
>
> It is "implementation defined" according to ARM ARM (B2-1265).
> But C bit cleared stops allocation that's true across all implementations.
>
>> That means your view of cacheable memory suddenly changes beneath you when
>> the C bit is turned off.
>
> Yes might be (see above) but the cache operations still work so we do
> not have any problem (well, as long as we clean and invalidate without
> using data that can live in the cache, but that's how it is done on v7 cache
> flush ops and it is perfectly fine).
>
>> From what you're saying - and from my understanding of your cache behaviours,
>> even the sequence:
>> - clean cache
>> - disable C bit
>> - clean cache
>> is buggy.
>
> No, that's correct, works fine on A9 and A15. Second clean is mostly nops.
>
>>
>> I think what you're effectively saying is that it is not possible to safely
>> power down a cache on an ARM SMP CPU...
>
> It is possible, but the final clean must be done with C bit cleared. It is
> belt and braces, agreed, but that's the only way to do it properly.
>
> Lorenzo
>

In my case, while powering down core0, core1 is already in the *off* state
and out of coherency. So, while shutting down CPU0 (CPU1 is already off), do
I still need to follow the steps you have mentioned to power down the cache
safely?
My feeling is that by the time I go to shutdown I am effectively operating in
single core mode, so it doesn't make any difference which of the two
sequences I follow. Please correct me if I am wrong.

--
Regards,
Murali N

^ permalink raw reply [flat|nested] 26+ messages in thread
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-14 17:15 ` Lorenzo Pieralisi
2012-05-15 9:25 ` Murali N
@ 2012-05-15 9:40 ` Russell King - ARM Linux
2012-05-15 10:09 ` Lorenzo Pieralisi
1 sibling, 1 reply; 26+ messages in thread
From: Russell King - ARM Linux @ 2012-05-15 9:40 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, May 14, 2012 at 06:15:33PM +0100, Lorenzo Pieralisi wrote:
> On Mon, May 14, 2012 at 05:39:09PM +0100, Russell King - ARM Linux wrote:
> > From what you're saying - and from my understanding of your cache behaviours,
> > even the sequence:
> > - clean cache
> > - disable C bit
> > - clean cache
> > is buggy.
>
> No, that's correct, works fine on A9 and A15. Second clean is mostly nops.

It's racy. Consider this:

- clean cache
- cache speculatively prefetches a dirty cache line from another CPU
- disable C bit

At this point, you lose access to that dirty data. If that dirty data is
used inbetween disabling the C bit and cleaning the cache for the second
time, you have data corruption issues.

Another point which needs to be checked is whether dirty cache lines in
a CPUs cache which has had the C bit disabled still take part in the
coherency protocol with other CPUs. If the answer is no, then that's a
_major_ problem for the hot unplug code paths. That effectively means
that we have a window where a CPU going down actively _corrupts_ the
data visible to other CPUs.

As I have said, given what you've mentioned, it is impossible to safely
disable the cache on a SMP system. In order to do it safely, you need to
have a way to disable new allocations into the cache _without_ disabling
the ability for the cache to be searched. And if we could do that, then
the sequence becomes a simple and race free:

- disable new allocations
- clean cache
- disable cache

^ permalink raw reply [flat|nested] 26+ messages in thread
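The window described in the message above can be reproduced in a small toy
model. This is a sketch under stated assumptions, not a description of any
real core: `ToyCore` and its `lookup_when_c_clear` flag are invented for the
example, with the flag standing in for the implementation-defined choice of
whether loads still search the L1 once SCTLR.C is clear.

```python
class ToyCore:
    """Toy core with a write-back L1 that tracks dirty lines only."""

    def __init__(self, ram, lookup_when_c_clear):
        self.ram = ram
        self.l1 = {}                    # addr -> dirty value
        self.c_bit = True               # models SCTLR.C
        self.lookup_when_c_clear = lookup_when_c_clear  # hypothetical knob

    def migrate_dirty_line_in(self, addr, value):
        # A dirty line arrives from another CPU (e.g. via a prefetch).
        self.l1[addr] = value

    def load(self, addr):
        searches_l1 = self.c_bit or self.lookup_when_c_clear
        if searches_l1 and addr in self.l1:
            return self.l1[addr]
        return self.ram[addr]           # bypasses the cache

    def clean_invalidate(self):
        self.ram.update(self.l1)        # write dirty lines back
        self.l1.clear()


def observed_values(lookup_when_c_clear):
    """Return (value seen inside the window, value seen after final clean)."""
    ram = {0x40: "stale"}
    core = ToyCore(ram, lookup_when_c_clear)
    core.clean_invalidate()                     # first clean
    core.migrate_dirty_line_in(0x40, "fresh")   # dirty line migrates in
    core.c_bit = False                          # clear SCTLR.C
    during_window = core.load(0x40)             # access before final clean
    core.clean_invalidate()                     # final clean
    after_clean = core.load(0x40)
    return during_window, after_clean
```

With lookups disabled when the C bit is clear, `observed_values(False)` gives
`("stale", "fresh")`: the core briefly sees old data and then the value
changes again after the clean. With lookups still enabled,
`observed_values(True)` gives `("fresh", "fresh")` and the window closes. The
model says nothing about whether other CPUs can still snoop the line, which is
the second open question raised in the message.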
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-15 9:40 ` Russell King - ARM Linux
@ 2012-05-15 10:09 ` Lorenzo Pieralisi
2012-05-15 10:15 ` Russell King - ARM Linux
0 siblings, 1 reply; 26+ messages in thread
From: Lorenzo Pieralisi @ 2012-05-15 10:09 UTC (permalink / raw)
To: linux-arm-kernel

On Tue, May 15, 2012 at 10:40:10AM +0100, Russell King - ARM Linux wrote:
> On Mon, May 14, 2012 at 06:15:33PM +0100, Lorenzo Pieralisi wrote:
> > On Mon, May 14, 2012 at 05:39:09PM +0100, Russell King - ARM Linux wrote:
> > > From what you're saying - and from my understanding of your cache behaviours,
> > > even the sequence:
> > > - clean cache
> > > - disable C bit
> > > - clean cache
> > > is buggy.
> >
> > No, that's correct, works fine on A9 and A15. Second clean is mostly nops.
>
> It's racy. Consider this:
>
> - clean cache
> - cache speculatively prefetches a dirty cache line from another CPU
> - disable C bit

- clean cache

> At this point, you lose access to that dirty data. If that dirty data is
> used inbetween disabling the C bit and cleaning the cache for the second
> time, you have data corruption issues.

It is not racy. After disabling the C bit the cache clean operations write-back
any dirty cache line to the next cache level. And the CPU is still in coherency
mode so there is not a problem with that either.

> Another point which needs to be checked is whether dirty cache lines in
> a CPUs cache which has had the C bit disabled still take part in the
> coherency protocol with other CPUs. If the answer is no, then that's a
> _major_ problem for the hot unplug code paths. That effectively means
> that we have a window where a CPU going down actively _corrupts_ the
> data visible to other CPUs.

See above.

> As I have said, given what you've mentioned, it is impossible to safely
> disable the cache on a SMP system. In order to do it safely, you need to
> have a way to disable new allocations into the cache _without_ disabling
> the ability for the cache to be searched.

Cache lines can be acted upon with maintenance operations whether the C bit is
set or clear. For instance caches can be invalidated when the MMU is off
and the C bit is clear, eg v7 boot.

Cache cleaning and cache enabling/disabling are two different things, that's
valid for the PL310 as well.

Lorenzo

^ permalink raw reply [flat|nested] 26+ messages in thread
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-15 10:09 ` Lorenzo Pieralisi
@ 2012-05-15 10:15 ` Russell King - ARM Linux
2012-05-15 16:28 ` Lorenzo Pieralisi
0 siblings, 1 reply; 26+ messages in thread
From: Russell King - ARM Linux @ 2012-05-15 10:15 UTC (permalink / raw)
To: linux-arm-kernel

On Tue, May 15, 2012 at 11:09:02AM +0100, Lorenzo Pieralisi wrote:
> On Tue, May 15, 2012 at 10:40:10AM +0100, Russell King - ARM Linux wrote:
> > On Mon, May 14, 2012 at 06:15:33PM +0100, Lorenzo Pieralisi wrote:
> > > On Mon, May 14, 2012 at 05:39:09PM +0100, Russell King - ARM Linux wrote:
> > > > From what you're saying - and from my understanding of your cache behaviours,
> > > > even the sequence:
> > > > - clean cache
> > > > - disable C bit
> > > > - clean cache
> > > > is buggy.
> > >
> > > No, that's correct, works fine on A9 and A15. Second clean is mostly nops.
> >
> > It's racy. Consider this:
> >
> > - clean cache
> > - cache speculatively prefetches a dirty cache line from another CPU
> > - disable C bit
> - clean cache

Thank you for totally missing the point and destroying the example.

> > At this point, you lose access to that dirty data. If that dirty data is
> > used inbetween disabling the C bit and cleaning the cache for the second
> > time, you have data corruption issues.
>
> It is not racy. After disabling the C bit the cache clean operations write-back
> any dirty cache line to the next cache level. And the CPU is still in coherency
> mode so there is not a problem with that either.

No. *THINK* about the exact example I gave you. Think about what state
the CPU sees between that "disable C bit" and the final cache clean (which
you seem to be insisting is an atomic operation.)

Please, read what I'm saying rather than re-interpreting it, augmenting it
and then answering something entirely different.

> > As I have said, given what you've mentioned, it is impossible to safely
> > disable the cache on a SMP system. In order to do it safely, you need to
> > have a way to disable new allocations into the cache _without_ disabling
> > the ability for the cache to be searched.
>
> Cache lines can be acted upon with maintenance operations whether the C bit is
> set or clear. For instance caches can be invalidated when the MMU is off
> and the C bit is clear, eg v7 boot.
>
> Cache cleaning and cache enabling/disabling are two different things, that's
> valid for the PL310 as well.

Yes, of course I realise that. That's not what I'm talking about at all.

^ permalink raw reply [flat|nested] 26+ messages in thread
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-15 10:15 ` Russell King - ARM Linux
@ 2012-05-15 16:28 ` Lorenzo Pieralisi
2012-05-15 16:36 ` Russell King - ARM Linux
0 siblings, 1 reply; 26+ messages in thread
From: Lorenzo Pieralisi @ 2012-05-15 16:28 UTC (permalink / raw)
To: linux-arm-kernel

On Tue, May 15, 2012 at 11:15:05AM +0100, Russell King - ARM Linux wrote:
> On Tue, May 15, 2012 at 11:09:02AM +0100, Lorenzo Pieralisi wrote:
> > On Tue, May 15, 2012 at 10:40:10AM +0100, Russell King - ARM Linux wrote:
> > > On Mon, May 14, 2012 at 06:15:33PM +0100, Lorenzo Pieralisi wrote:
> > > > On Mon, May 14, 2012 at 05:39:09PM +0100, Russell King - ARM Linux wrote:
> > > > > From what you're saying - and from my understanding of your cache behaviours,
> > > > > even the sequence:
> > > > > - clean cache
> > > > > - disable C bit
> > > > > - clean cache
> > > > > is buggy.
> > > >
> > > > No, that's correct, works fine on A9 and A15. Second clean is mostly nops.
> > >
> > > It's racy. Consider this:
> > >
> > > - clean cache
> > > - cache speculatively prefetches a dirty cache line from another CPU
> > > - disable C bit
> > - clean cache
>
> Thank you for totally missing the point and destroying the example.
>
> > > At this point, you lose access to that dirty data. If that dirty data is
> > > used inbetween disabling the C bit and cleaning the cache for the second
> > > time, you have data corruption issues.
> >
> > It is not racy. After disabling the C bit the cache clean operations write-back
> > any dirty cache line to the next cache level. And the CPU is still in coherency
> > mode so there is not a problem with that either.
>
> No. *THINK* about the exact example I gave you. Think about what state
> the CPU sees between that "disable C bit" and the final cache clean (which
> you seem to be insisting is an atomic operation.)
>
> Please, read what I'm saying rather than re-interpreting it, augmenting it
> and then answering something entirely different.
>
> > > As I have said, given what you've mentioned, it is impossible to safely
> > > disable the cache on a SMP system. In order to do it safely, you need to
> > > have a way to disable new allocations into the cache _without_ disabling
> > > the ability for the cache to be searched.

First off, my apologies, it was not meant to disrupt the discussion, if
I did sorry about that. Let's try to sum it up:

1) Hitting in the cache when the SCTLR.C is cleared is CPU specific (eg A9
   does not, A15 does)
2) as long as they are taking part in coherency (SMP bit set in ACTLR), all
   Cortex-A cores in a MP configuration with the SCTLR.C bit set can hit in
   the cache of a CPU that runs with the C bit cleared in SCTLR
3) The sequence:
   - clean cache
   - clear SCTLR.C
   - clean cache

   is correct and we must mandate it, with the following remarks:
   - The first cache clean is superfluous (but does no harm)
   - The second cache clean must not rely on any data that might sit in the
     cache
   - clearing SCTLR.C and cleaning the cache must be coded in assembly in a
     function carrying out both operations (to avoid stack issues ie
     cacheable push/pop ops and any global data reference)
4) Current vexpress hotplug code clears ACTLR.SMP bit before clearing SCTLR.C;
   this is a bug according to this discussion and we must fix it (to avoid
   copy'n'paste of code that does not follow the standard for platforms that
   have PM capabilities beyond standbywfi)

Please let me know if I am missing something and thanks for the discussion.

Lorenzo

^ permalink raw reply [flat|nested] 26+ messages in thread
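The ordering constraints proposed in the summary above (final L1 clean after
SCTLR.C is cleared, ACTLR.SMP cleared only after both) can be written down as
a small checker. This is a sketch only: the step names and the
`check_cpu_shutdown_order` function are made up for the illustration, and the
checker encodes the position taken in that summary, which is still being
disputed in the thread.

```python
def check_cpu_shutdown_order(steps):
    """Return a list of ordering problems for a single-CPU power-down.

    `steps` is a list of hypothetical step names in execution order:
    "clean_l1", "clear_sctlr_c", "clear_actlr_smp".
    """
    problems = []
    for name in ("clear_sctlr_c", "clean_l1", "clear_actlr_smp"):
        if name not in steps:
            problems.append(f"missing step: {name}")
    if problems:
        return problems

    c_off = steps.index("clear_sctlr_c")
    smp_off = steps.index("clear_actlr_smp")
    # Index of the LAST clean: earlier cleans are harmless but not sufficient.
    last_clean = len(steps) - 1 - steps[::-1].index("clean_l1")

    if last_clean < c_off:
        problems.append("final L1 clean runs with SCTLR.C still set")
    if smp_off < c_off:
        problems.append("ACTLR.SMP cleared before SCTLR.C")
    if smp_off < last_clean:
        problems.append("coherency exited before the final L1 clean")
    return problems


proposed = ["clean_l1", "clear_sctlr_c", "clean_l1", "clear_actlr_smp"]
smp_first = ["clear_actlr_smp", "clear_sctlr_c", "clean_l1"]
```

`check_cpu_shutdown_order(proposed)` returns an empty list, while the
`smp_first` ordering (the pattern point 4 calls a bug) is flagged twice. A
checker like this cannot express the remark that the code between clearing
SCTLR.C and the final clean must not touch cacheable data; that property has
to be ensured by how the routine itself is written.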
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-15 16:28 ` Lorenzo Pieralisi
@ 2012-05-15 16:36 ` Russell King - ARM Linux
2012-05-15 17:05 ` Lorenzo Pieralisi
2012-05-15 18:17 ` Will Deacon
0 siblings, 2 replies; 26+ messages in thread
From: Russell King - ARM Linux @ 2012-05-15 16:36 UTC (permalink / raw)
To: linux-arm-kernel

On Tue, May 15, 2012 at 05:28:51PM +0100, Lorenzo Pieralisi wrote:
> On Tue, May 15, 2012 at 11:15:05AM +0100, Russell King - ARM Linux wrote:
> > On Tue, May 15, 2012 at 11:09:02AM +0100, Lorenzo Pieralisi wrote:
> > > On Tue, May 15, 2012 at 10:40:10AM +0100, Russell King - ARM Linux wrote:
> > > > On Mon, May 14, 2012 at 06:15:33PM +0100, Lorenzo Pieralisi wrote:
> > > > > On Mon, May 14, 2012 at 05:39:09PM +0100, Russell King - ARM Linux wrote:
> > > > > > From what you're saying - and from my understanding of your cache behaviours,
> > > > > > even the sequence:
> > > > > > - clean cache
> > > > > > - disable C bit
> > > > > > - clean cache
> > > > > > is buggy.
> > > > >
> > > > > No, that's correct, works fine on A9 and A15. Second clean is mostly nops.
> > > >
> > > > It's racy. Consider this:
> > > >
> > > > - clean cache
> > > > - cache speculatively prefetches a dirty cache line from another CPU
> > > > - disable C bit
> > > - clean cache
> >
> > Thank you for totally missing the point and destroying the example.
> >
> > > > At this point, you lose access to that dirty data. If that dirty data is
> > > > used in between disabling the C bit and cleaning the cache for the second
> > > > time, you have data corruption issues.
> > >
> > > It is not racy. After disabling the C bit the cache clean operations write-back
> > > any dirty cache line to the next cache level. And the CPU is still in coherency
> > > mode so there is not a problem with that either.
> >
> > No. *THINK* about the exact example I gave you. Think about what state
> > the CPU sees between that "disable C bit" and the final cache clean (which
> > you seem to be insisting is an atomic operation.)
> >
> > Please, read what I'm saying rather than re-interpreting it, augmenting it
> > and then answering something entirely different.
> >
> > > > As I have said, given what you've mentioned, it is impossible to safely
> > > > disable the cache on a SMP system. In order to do it safely, you need to
> > > > have a way to disable new allocations into the cache _without_ disabling
> > > > the ability for the cache to be searched.
>
> First off, my apologies, it was not meant to disrupt the discussion, if
> I did sorry about that. Let's try to sum it up:
>
> 1) Hitting in the cache when the SCTLR.C is cleared is CPU specific (eg A9
>    does not, A15 does)
> 2) as long as they are taking part in coherency (SMP bit set in ACTLR), all
>    Cortex-A cores in a MP configuration with the SCTLR.C bit set can hit in
>    the cache of a CPU that runs with the C bit cleared in SCTLR
> 3) The sequence:
>    - clean cache
>    - clear SCTLR.C
>    - clean cache
>
>    is correct

I continue to disagree with that assertion.

I repeat: what happens in this situation on A9:

      - clean cache
      - cache speculatively prefetches data from another core
      - clear SCTLR.C
      - _this_ core accesses the address associated with that prefetched
        data

_That_ is a data corruption issue - as soon as SCTLR.C is cleared, the CPU's
view of data in memory _changes_, and is only restored to what it should
be when the dirty cache lines are finally flushed out of the cache. And
then, hey presto, the data magically changes again.

So, the above sequence is _not_ safe, and it's _not_ "correct". It _might_
be the closest thing you can get to given the broken hardware design, but
calling this _correct_ is a silly thing to do when it contains such a
problem.
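
Russell's A9 scenario above can be traced with a small toy model. This is only a sketch under stated assumptions: it models a single word of memory and a single L1 line, and the names (`toy_core`, `toy_load`, `toy_clean`, `toy_prefetch_dirty`) are hypothetical and not taken from any kernel source. It assumes, as stated in the thread, that an A9-like core stops searching its own cache once SCTLR.C is cleared, while an A15-like core still hits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical, heavily simplified): one word of memory and
 * one local L1 line that may shadow it. */
struct toy_core {
	uint32_t memory;        /* value currently held in DDR */
	uint32_t line;          /* value held in the local L1 line */
	bool line_valid;        /* the line is present in L1 */
	bool line_dirty;        /* the line is newer than memory */
	bool sctlr_c;           /* models the SCTLR.C bit */
	bool hits_when_c_clear; /* false: A9-like, true: A15-like */
};

/* Value this core observes for the address: the cached copy if the core
 * is allowed to search its cache and the line is valid, else memory. */
uint32_t toy_load(const struct toy_core *c)
{
	bool may_search = c->sctlr_c || c->hits_when_c_clear;

	if (may_search && c->line_valid)
		return c->line;
	return c->memory;
}

/* Cache clean: write a dirty line back to memory (the line stays valid). */
void toy_clean(struct toy_core *c)
{
	if (c->line_valid && c->line_dirty) {
		c->memory = c->line;
		c->line_dirty = false;
	}
}

/* Speculative prefetch that pulls in a line another core has dirtied.
 * Allocation is only possible while SCTLR.C is still set. */
void toy_prefetch_dirty(struct toy_core *c, uint32_t value)
{
	assert(c->sctlr_c);
	c->line = value;
	c->line_valid = true;
	c->line_dirty = true;
}
```

In this model, starting from `memory = 1` on an A9-like core, a prefetched dirty value of 2 is visible while C is set, reads fall back to the stale 1 once C is cleared, and the value becomes 2 again after `toy_clean`. That double change of view is the window being argued about; on an A15-like core (`hits_when_c_clear = true`) the model returns 2 throughout.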
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-15 16:36 ` Russell King - ARM Linux
@ 2012-05-15 17:05 ` Lorenzo Pieralisi
2012-09-19 8:55 ` Antti P Miettinen
2012-05-15 18:17 ` Will Deacon
1 sibling, 1 reply; 26+ messages in thread
From: Lorenzo Pieralisi @ 2012-05-15 17:05 UTC (permalink / raw)
To: linux-arm-kernel

On Tue, May 15, 2012 at 05:36:18PM +0100, Russell King - ARM Linux wrote:
> On Tue, May 15, 2012 at 05:28:51PM +0100, Lorenzo Pieralisi wrote:
> > On Tue, May 15, 2012 at 11:15:05AM +0100, Russell King - ARM Linux wrote:
> > > On Tue, May 15, 2012 at 11:09:02AM +0100, Lorenzo Pieralisi wrote:
> > > > On Tue, May 15, 2012 at 10:40:10AM +0100, Russell King - ARM Linux wrote:
> > > > > On Mon, May 14, 2012 at 06:15:33PM +0100, Lorenzo Pieralisi wrote:
> > > > > > On Mon, May 14, 2012 at 05:39:09PM +0100, Russell King - ARM Linux wrote:
> > > > > > > From what you're saying - and from my understanding of your cache behaviours,
> > > > > > > even the sequence:
> > > > > > > - clean cache
> > > > > > > - disable C bit
> > > > > > > - clean cache
> > > > > > > is buggy.
> > > > > >
> > > > > > No, that's correct, works fine on A9 and A15. Second clean is mostly nops.
> > > > >
> > > > > It's racy. Consider this:
> > > > >
> > > > > - clean cache
> > > > > - cache speculatively prefetches a dirty cache line from another CPU
> > > > > - disable C bit
> > > > - clean cache
> > >
> > > Thank you for totally missing the point and destroying the example.
> > >
> > > > > At this point, you lose access to that dirty data. If that dirty data is
> > > > > used in between disabling the C bit and cleaning the cache for the second
> > > > > time, you have data corruption issues.
> > > >
> > > > It is not racy. After disabling the C bit the cache clean operations write-back
> > > > any dirty cache line to the next cache level. And the CPU is still in coherency
> > > > mode so there is not a problem with that either.
> > >
> > > No. *THINK* about the exact example I gave you. Think about what state
> > > the CPU sees between that "disable C bit" and the final cache clean (which
> > > you seem to be insisting is an atomic operation.)
> > >
> > > Please, read what I'm saying rather than re-interpreting it, augmenting it
> > > and then answering something entirely different.
> > >
> > > > > As I have said, given what you've mentioned, it is impossible to safely
> > > > > disable the cache on a SMP system. In order to do it safely, you need to
> > > > > have a way to disable new allocations into the cache _without_ disabling
> > > > > the ability for the cache to be searched.
> >
> > First off, my apologies, it was not meant to disrupt the discussion, if
> > I did sorry about that. Let's try to sum it up:
> >
> > 1) Hitting in the cache when the SCTLR.C is cleared is CPU specific (eg A9
> >    does not, A15 does)
> > 2) as long as they are taking part in coherency (SMP bit set in ACTLR), all
> >    Cortex-A cores in a MP configuration with the SCTLR.C bit set can hit in
> >    the cache of a CPU that runs with the C bit cleared in SCTLR
> > 3) The sequence:
> >    - clean cache
> >    - clear SCTLR.C
> >    - clean cache
> >
> >    is correct
>
> I continue to disagree with that assertion.
>
> I repeat: what happens in this situation on A9:
>
>       - clean cache
>       - cache speculatively prefetches data from another core
>       - clear SCTLR.C
>       - _this_ core accesses the address associated with that prefetched
>         data
>
> _That_ is a data corruption issue - as soon as SCTLR.C is cleared, the CPU's
> view of data in memory _changes_, and is only restored to what it should
> be when the dirty cache lines are finally flushed out of the cache. And
> then, hey presto, the data magically changes again.

What you are saying is correct, no doubts about that; I think though that in
this controlled code execution code path for power down, explicit data
access after clearing the C bit but before cleaning the cache must and can
be prevented.

What we should do as I described, is executing the sequence:

clear SCTLR.C
clean cache
exit coherency

in an uninterruptible way (it is always executed with IRQs disabled) and with
no explicit access to any data whatsoever. If we code that in assembly (and
lots of us already did that for v7, eg OMAP4) in a controlled code path, I
think we can call it safe, that's my opinion FWIW.

Thanks,
Lorenzo
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-15 17:05 ` Lorenzo Pieralisi
@ 2012-09-19 8:55 ` Antti P Miettinen
2012-09-20 9:54 ` Lorenzo Pieralisi
0 siblings, 1 reply; 26+ messages in thread
From: Antti P Miettinen @ 2012-09-19 8:55 UTC (permalink / raw)
To: linux-arm-kernel

Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
> What we should do as I described, is executing the sequence:
>
> clear SCTLR.C
> clean cache
> exit coherency

How does SCTLR.C affect TLB fetches? Especially on A9? Seems that page
table updates do clean_dcache_area() so probably not an issue but just
out of curiosity, are TLB fetches affected by the C bit on A9?

--
Antti P Miettinen
http://www.iki.fi/~ananaza/
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-09-19 8:55 ` Antti P Miettinen
@ 2012-09-20 9:54 ` Lorenzo Pieralisi
2012-09-20 21:17 ` Antti P Miettinen
0 siblings, 1 reply; 26+ messages in thread
From: Lorenzo Pieralisi @ 2012-09-20 9:54 UTC (permalink / raw)
To: linux-arm-kernel

On Wed, Sep 19, 2012 at 09:55:52AM +0100, Antti P Miettinen wrote:
> Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
> > What we should do as I described, is executing the sequence:
> >
> > clear SCTLR.C
> > clean cache
> > exit coherency
>
> How does SCTLR.C affect TLB fetches? Especially on A9? Seems that page
> table updates do clean_dcache_area() so probably not an issue but just
> out of curiosity, are TLB fetches affected by the C bit on A9?

Yes, they are. TLB fetches cannot search the D-cache if the C bit in
SCTLR is clear on A9. I do not see any issue with this though, at least
in the power down procedure described above and in previous e-mails in
this thread.

Lorenzo
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-09-20 9:54 ` Lorenzo Pieralisi
@ 2012-09-20 21:17 ` Antti P Miettinen
2012-09-23 21:32 ` Antti P Miettinen
0 siblings, 1 reply; 26+ messages in thread
From: Antti P Miettinen @ 2012-09-20 21:17 UTC (permalink / raw)
To: linux-arm-kernel

Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
> On Wed, Sep 19, 2012 at 09:55:52AM +0100, Antti P Miettinen wrote:
>> Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
>> > What we should do as I described, is executing the sequence:
>> >
>> > clear SCTLR.C
>> > clean cache
>> > exit coherency
>>
>> How does SCTLR.C affect TLB fetches? Especially on A9? Seems that page
>> table updates do clean_dcache_area() so probably not an issue but just
>> out of curiosity, are TLB fetches affected by the C bit on A9?
>
> Yes, they are. TLB fetches cannot search the D-cache if the C bit in
> SCTLR is clear on A9. I do not see any issue with this though, at least
> in the power down procedure described above and in previous e-mails in
> this thread.
>
> Lorenzo

Hmm.. is the condition for cache coherence protocol then different from
TLB lookups? If C is cleared, is the cache available for snoops by other
cores? What happens if another core needs a dirty line in a cache that
has C cleared?

--
Antti P Miettinen
http://www.iki.fi/~ananaza/
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-09-20 21:17 ` Antti P Miettinen
@ 2012-09-23 21:32 ` Antti P Miettinen
2013-02-22 9:04 ` Antti P Miettinen
0 siblings, 1 reply; 26+ messages in thread
From: Antti P Miettinen @ 2012-09-23 21:32 UTC (permalink / raw)
To: linux-arm-kernel

Antti P Miettinen <ananaza@iki.fi> writes:
> Hmm.. is the condition for cache coherence protocol then different from
> TLB lookups? If C is cleared, is the cache available for snoops by other
> cores? What happens if another core needs a dirty line in a cache that
> has C cleared?

Sorry - looks like you already answered this:

> 2) as long as they are taking part in coherency (SMP bit set in ACTLR), all
>    Cortex-A cores in a MP configuration with the SCTLR.C bit set can hit in
>    the cache of a CPU that runs with the C bit cleared in SCTLR

So other cores apparently can search the cache that has C bit
cleared. The only clarification I still would need is whether this
searching also applies to TLB fetches by other cores. So when you say:

> .. TLB fetches cannot search the D-cache if the C bit in
> SCTLR is clear on A9. ..

you meant TLB fetches by the core that has its C bit cleared. The TLB
fetches by other cores will still search the cache just like any other
coherence searches?

--
Antti P Miettinen
http://www.iki.fi/~ananaza/
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-09-23 21:32 ` Antti P Miettinen
@ 2013-02-22 9:04 ` Antti P Miettinen
2013-02-22 9:39 ` Lorenzo Pieralisi
0 siblings, 1 reply; 26+ messages in thread
From: Antti P Miettinen @ 2013-02-22 9:04 UTC (permalink / raw)
To: linux-arm-kernel

Hello, coming back to an old thread:

Antti P Miettinen <ananaza@iki.fi> writes:
> Antti P Miettinen <ananaza@iki.fi> writes:
>> Hmm.. is the condition for cache coherence protocol then different from
>> TLB lookups? If C is cleared, is the cache available for snoops by other
>> cores? What happens if another core needs a dirty line in a cache that
>> has C cleared?
>
> Sorry - looks like you already answered this:
>> 2) as long as they are taking part in coherency (SMP bit set in ACTLR), all
>>    Cortex-A cores in a MP configuration with the SCTLR.C bit set can hit in
>>    the cache of a CPU that runs with the C bit cleared in SCTLR
>
> So other cores apparently can search the cache that has C bit
> cleared. The only clarification I still would need is whether this
> searching also applies to TLB fetches by other cores. So when you say:
>> .. TLB fetches cannot search the D-cache if the C bit in
>> SCTLR is clear on A9. ..
>
> you meant TLB fetches by the core that has its C bit cleared. The TLB
> fetches by other cores will still search the cache just like any other
> coherence searches?

This did not get answered - are TLB fetches by sibling cores treated in
the same way as cache fetches? If core A has C bit cleared, is the cache
still available for TLB fetches by core B?

--
Antti P Miettinen
http://www.iki.fi/~ananaza/
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2013-02-22 9:04 ` Antti P Miettinen
@ 2013-02-22 9:39 ` Lorenzo Pieralisi
2013-02-23 20:41 ` Antti P Miettinen
0 siblings, 1 reply; 26+ messages in thread
From: Lorenzo Pieralisi @ 2013-02-22 9:39 UTC (permalink / raw)
To: linux-arm-kernel

On Fri, Feb 22, 2013 at 09:04:04AM +0000, Antti P Miettinen wrote:
> Hello, coming back to an old thread:
>
> Antti P Miettinen <ananaza@iki.fi> writes:
> > Antti P Miettinen <ananaza@iki.fi> writes:
> >> Hmm.. is the condition for cache coherence protocol then different from
> >> TLB lookups? If C is cleared, is the cache available for snoops by other
> >> cores? What happens if another core needs a dirty line in a cache that
> >> has C cleared?
> >
> > Sorry - looks like you already answered this:
> >> 2) as long as they are taking part in coherency (SMP bit set in ACTLR), all
> >>    Cortex-A cores in a MP configuration with the SCTLR.C bit set can hit in
> >>    the cache of a CPU that runs with the C bit cleared in SCTLR
> >
> > So other cores apparently can search the cache that has C bit
> > cleared. The only clarification I still would need is whether this
> > searching also applies to TLB fetches by other cores. So when you say:
> >> .. TLB fetches cannot search the D-cache if the C bit in
> >> SCTLR is clear on A9. ..
> >
> > you meant TLB fetches by the core that has its C bit cleared. The TLB
> > fetches by other cores will still search the cache just like any other
> > coherence searches?
>
> This did not get answered - are TLB fetches by sibling cores treated in
> the same way as cache fetches? If core A has C bit cleared, is the cache
> still available for TLB fetches by core B?

Yes, it is as long as the SMP bit is set in ACTLR.

Lorenzo
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2013-02-22 9:39 ` Lorenzo Pieralisi
@ 2013-02-23 20:41 ` Antti P Miettinen
2013-02-25 13:36 ` Lorenzo Pieralisi
0 siblings, 1 reply; 26+ messages in thread
From: Antti P Miettinen @ 2013-02-23 20:41 UTC (permalink / raw)
To: linux-arm-kernel

From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> On Fri, Feb 22, 2013 at 09:04:04AM +0000, Antti P Miettinen wrote:
>> This did not get answered - are TLB fetches by sibling cores treated in
>> the same way as cache fetches? If core A has C bit cleared, is the cache
>> still available for TLB fetches by core B?
>
> Yes, it is as long as the SMP bit is set in ACTLR.
>
> Lorenzo

Thanks Lorenzo. Do you know if there are any known errata that would
invalidate any of the assumptions discussed in this thread?

--
Antti P Miettinen
http://www.iki.fi/~ananaza/
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2013-02-23 20:41 ` Antti P Miettinen
@ 2013-02-25 13:36 ` Lorenzo Pieralisi
0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Pieralisi @ 2013-02-25 13:36 UTC (permalink / raw)
To: linux-arm-kernel

On Sat, Feb 23, 2013 at 08:41:17PM +0000, Antti P Miettinen wrote:
> From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > On Fri, Feb 22, 2013 at 09:04:04AM +0000, Antti P Miettinen wrote:
> >> This did not get answered - are TLB fetches by sibling cores treated in
> >> the same way as cache fetches? If core A has C bit cleared, is the cache
> >> still available for TLB fetches by core B?
> >
> > Yes, it is as long as the SMP bit is set in ACTLR.
> >
> > Lorenzo
>
> Thanks Lorenzo. Do you know if there are any known errata that would
> invalidate any of the assumptions discussed in this thread?

If you can provide me with a bit of context I am happy to help you chase
this issue(s), since it looks like you are facing some.

Thanks,
Lorenzo
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-15 16:36 ` Russell King - ARM Linux
2012-05-15 17:05 ` Lorenzo Pieralisi
@ 2012-05-15 18:17 ` Will Deacon
2012-05-17 5:01 ` Murali N
1 sibling, 1 reply; 26+ messages in thread
From: Will Deacon @ 2012-05-15 18:17 UTC (permalink / raw)
To: linux-arm-kernel

Hi Russell,

On Tue, May 15, 2012 at 05:36:18PM +0100, Russell King - ARM Linux wrote:
> I repeat: what happens in this situation on A9:
>
>       - clean cache
>       - cache speculatively prefetches data from another core

If this prefetching occurs then either:

        (a) The line is clean (no problem)

        (b) Another core has written some data and we end up (speculatively)
            loading dirty lines

Case (b) is only a problem if we actually commit to using the data later on.

>       - clear SCTLR.C
>       - _this_ core accesses the address associated with that prefetched
>         data

Yes. At this point it is cpu-specific whether or not we hit our dirty lines
from above. On A9, we will get the stale data from memory. However, this is
exactly the same situation we would find ourselves in if we tried to access
dirty data held in any cache with our SCTLR.C bit cleared. We're no longer
coherent at this stage, so need to avoid accessing shared data.

> _That_ is a data corruption issue - as soon as SCTLR.C is cleared, the CPU's
> view of data in memory _changes_, and is only restored to what it should
> be when the dirty cache lines are finally flushed out of the cache. And
> then, hey presto, the data magically changes again.

Well we still can't see dirty data in any of the other L1 caches, so our view
of memory is going to be constantly out of date. The tricky bit is ensuring
that we don't rely on data being written by anybody else (and if we write
data ourselves, we need to make sure it's suitably aligned so as not to get
clobbered by evictions from the other caches).

Will
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-15 18:17 ` Will Deacon
@ 2012-05-17 5:01 ` Murali N
2012-05-17 7:30 ` Shilimkar, Santosh
0 siblings, 1 reply; 26+ messages in thread
From: Murali N @ 2012-05-17 5:01 UTC (permalink / raw)
To: linux-arm-kernel

On Tue, May 15, 2012 at 11:47 PM, Will Deacon <will.deacon@arm.com> wrote:
> Hi Russell,
>
> On Tue, May 15, 2012 at 05:36:18PM +0100, Russell King - ARM Linux wrote:
>> I repeat: what happens in this situation on A9:
>>
>>       - clean cache
>>       - cache speculatively prefetches data from another core
>
> If this prefetching occurs then either:
>
>         (a) The line is clean (no problem)
>
>         (b) Another core has written some data and we end up (speculatively)
>             loading dirty lines
>
> Case (b) is only a problem if we actually commit to using the data later on.
>
>>       - clear SCTLR.C
>>       - _this_ core accesses the address associated with that prefetched
>>         data
>
> Yes. At this point it is cpu-specific whether or not we hit our dirty lines
> from above. On A9, we will get the stale data from memory. However, this is
> exactly the same situation we would find ourselves in if we tried to access
> dirty data held in any cache with our SCTLR.C bit cleared. We're no longer
> coherent at this stage, so need to avoid accessing shared data.
>
>> _That_ is a data corruption issue - as soon as SCTLR.C is cleared, the CPU's
>> view of data in memory _changes_, and is only restored to what it should
>> be when the dirty cache lines are finally flushed out of the cache. And
>> then, hey presto, the data magically changes again.
>
> Well we still can't see dirty data in any of the other L1 caches, so our view
> of memory is going to be constantly out of date. The tricky bit is ensuring
> that we don't rely on data being written by anybody else (and if we write
> data ourselves, we need to make sure it's suitably aligned so as not to get
> clobbered by evictions from the other caches).
>
> Will

Would following the sequence below still cause any possible problems?

1. L1 clean & invalidate
2. L2 clean & invalidate
3. Disable L2
4. L1 clean & invalidate
5. Disable "C" bit
6. WFI

--
Regards,
Murali N
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-17 5:01 ` Murali N
@ 2012-05-17 7:30 ` Shilimkar, Santosh
0 siblings, 0 replies; 26+ messages in thread
From: Shilimkar, Santosh @ 2012-05-17 7:30 UTC (permalink / raw)
To: linux-arm-kernel

On Thu, May 17, 2012 at 10:31 AM, Murali N <nalajala.murali@gmail.com> wrote:
> On Tue, May 15, 2012 at 11:47 PM, Will Deacon <will.deacon@arm.com> wrote:
>> Hi Russell,
>>
>> On Tue, May 15, 2012 at 05:36:18PM +0100, Russell King - ARM Linux wrote:
>>> I repeat: what happens in this situation on A9:
>>>
>>>       - clean cache
>>>       - cache speculatively prefetches data from another core
>>
>> If this prefetching occurs then either:
>>
>>         (a) The line is clean (no problem)
>>
>>         (b) Another core has written some data and we end up (speculatively)
>>             loading dirty lines
>>
>> Case (b) is only a problem if we actually commit to using the data later on.
>>
>>>       - clear SCTLR.C
>>>       - _this_ core accesses the address associated with that prefetched
>>>         data
>>
>> Yes. At this point it is cpu-specific whether or not we hit our dirty lines
>> from above. On A9, we will get the stale data from memory. However, this is
>> exactly the same situation we would find ourselves in if we tried to access
>> dirty data held in any cache with our SCTLR.C bit cleared. We're no longer
>> coherent at this stage, so need to avoid accessing shared data.
>>
>>> _That_ is a data corruption issue - as soon as SCTLR.C is cleared, the CPU's
>>> view of data in memory _changes_, and is only restored to what it should
>>> be when the dirty cache lines are finally flushed out of the cache. And
>>> then, hey presto, the data magically changes again.
>>
>> Well we still can't see dirty data in any of the other L1 caches, so our view
>> of memory is going to be constantly out of date. The tricky bit is ensuring
>> that we don't rely on data being written by anybody else (and if we write
>> data ourselves, we need to make sure it's suitably aligned so as not to get
>> clobbered by evictions from the other caches).
>>
>> Will
>
> Would following the sequence below still cause any possible problems?
>
> 1. L1 clean & invalidate
> 2. L2 clean & invalidate
> 3. Disable L2
> 4. L1 clean & invalidate
> 5. Disable "C" bit
> 6. WFI
>
This is wrong if the code path is common for CPU and CPU cluster power
down. As Russell pointed out the corner cases, the sequence I got working
without any issues so far on OMAP is like below ...

- L1 clean & invalidate
- Disable "C" bit
- ISB
- L1 clean & invalidate
- Disable SMP bit
- ISB
- Check for cluster state
        if cluster == OFF
        - L2 clean & invalidate
isb
dsb
WFI
NOP ( To avoid speculative aborts if any)
NOP
NOP
NOP

No. of NOPS depends on the pipeline depth.

Hope it helps

Regards
Santosh
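
The ordering in Santosh's list can be written down as data so that the CPU-only and cluster-off paths are easy to compare. The following is a minimal sketch, not OMAP code: the names (`pd_step`, `pd_build_steps`, `PD_MAX_STEPS`) are hypothetical, the function only records the order of the steps and performs no cache or CP15 operation, and the trailing NOPs are left out because their number depends on the pipeline depth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the steps listed in the message above. */
enum pd_step {
	PD_L1_CLEAN_INV,     /* L1 clean & invalidate */
	PD_CLEAR_SCTLR_C,    /* disable "C" bit */
	PD_ISB,
	PD_CLEAR_ACTLR_SMP,  /* disable SMP bit (exit coherency) */
	PD_L2_CLEAN_INV,     /* L2 clean & invalidate, cluster-off only */
	PD_DSB,
	PD_WFI,
};

#define PD_MAX_STEPS 10

/* Fill `out` with the step order for a CPU power down; the L2 clean and
 * invalidate is included only when the whole cluster is going off.
 * Returns the number of steps written. */
size_t pd_build_steps(bool cluster_off, enum pd_step out[PD_MAX_STEPS])
{
	size_t n = 0;

	out[n++] = PD_L1_CLEAN_INV;
	out[n++] = PD_CLEAR_SCTLR_C;
	out[n++] = PD_ISB;
	out[n++] = PD_L1_CLEAN_INV;     /* second clean, after C is cleared */
	out[n++] = PD_CLEAR_ACTLR_SMP;
	out[n++] = PD_ISB;
	if (cluster_off)
		out[n++] = PD_L2_CLEAN_INV;
	out[n++] = PD_ISB;
	out[n++] = PD_DSB;
	out[n++] = PD_WFI;

	assert(n <= PD_MAX_STEPS);
	return n;
}
```

Calling `pd_build_steps(false, steps)` yields nine steps and `pd_build_steps(true, steps)` yields ten, the extra one being the L2 clean and invalidate placed after the SMP bit is cleared. Note that the SCTLR.C clear always precedes the ACTLR.SMP clear, which matches the ordering argued for earlier in the thread.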
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2012-05-14 16:21 ` Lorenzo Pieralisi
2012-05-14 16:39 ` Russell King - ARM Linux
@ 2013-12-24 17:52 ` Antti Miettinen
2014-01-06 12:43 ` Lorenzo Pieralisi
1 sibling, 1 reply; 26+ messages in thread
From: Antti Miettinen @ 2013-12-24 17:52 UTC (permalink / raw)
To: linux-arm-kernel

Sorry to still bring up an old thread, but this still bothers me..

Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
> [..] dirty cache lines can be migrated across
> processors caches. [..]

What are the conditions under which this can happen? Which CPUs in
reality migrate dirty lines between caches? And C==0 does prevent
migrations as well as local allocations?

--Antti
* L1 & L2 cache flush sequence on CortexA5 MPcore w.r.t low power modes
2013-12-24 17:52 ` Antti Miettinen
@ 2014-01-06 12:43 ` Lorenzo Pieralisi
0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Pieralisi @ 2014-01-06 12:43 UTC (permalink / raw)
To: linux-arm-kernel

On Tue, Dec 24, 2013 at 05:52:48PM +0000, Antti Miettinen wrote:
> Sorry to still bring up an old thread, but this still bothers me..
>
> Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
> > [..] dirty cache lines can be migrated across
> > processors caches. [..]
>
> What are the conditions under which this can happen? Which CPUs in
> reality migrate dirty lines between caches? And C==0 does prevent
> migrations as well as local allocations?

It happens if a cache miss is for a cache line that is dirty on another
CPU that is part of the coherency domain, the line is just moved from
one L1 to the local L1.

Yes, C bit cleared prevents allocations so it prevents migrations too.

Lorenzo
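
The migration rule described in the last reply can be illustrated with a two-core toy model. This is a hedged sketch only: `mig_core` and `mig_miss` are hypothetical names, a single dirty line is tracked, and requiring the SMP bit on the requesting core as well as on the owner is an assumption of the model, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy view of one core for a single cache line (hypothetical model). */
struct mig_core {
	bool sctlr_c;    /* SCTLR.C set: core may allocate into its L1 */
	bool actlr_smp;  /* ACTLR.SMP set: core takes part in coherency */
	bool has_dirty;  /* core's L1 currently holds the dirty line */
};

/* Cache miss on `req` for a line that may be dirty in `owner`.
 * Returns true if the dirty line moved from owner's L1 to req's L1. */
bool mig_miss(struct mig_core *req, struct mig_core *owner)
{
	/* Both cores must be in the coherency domain (assumption for req). */
	if (!owner->has_dirty || !owner->actlr_smp || !req->actlr_smp)
		return false;

	/* C bit cleared: no allocation, hence no migration. */
	if (!req->sctlr_c)
		return false;

	owner->has_dirty = false;
	req->has_dirty = true;
	return true;
}
```

With core A holding the dirty line and core B running with SCTLR.C cleared, `mig_miss(&b, &a)` returns false and the line stays in A; once B's C bit is set, the same call moves the line into B's L1.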