* [RFC PATCH] ARM: Fix the bug in dcache flush all API.
@ 2011-12-16 9:02 sricharan
2011-12-16 10:38 ` Will Deacon
0 siblings, 1 reply; 6+ messages in thread
From: sricharan @ 2011-12-16 9:02 UTC (permalink / raw)
To: linux-arm-kernel
Currently the v7_flush_dcache_all api uses the loc bit field
in clidr register to find out the last level of cache that has to be
flushed to achieve the data coherency. The loc bit field is present in
the register from bits[24:26], but the algorithm uses bits[23:25] as
the loc bit field. Bit[23] which corresponds to the cache type bit of
level 8 is zero in most of the architectures. As a example, for a
level 3 coherency the loc bit field is actually 2, since the bits[23:25]
are used, the loc becomes 4 (multipled by 2) and algorithm compares this
with current cace level * 2, which makes it work. But this wont work starting
from cases where the loc bit field becomes 4, since bits[23:25] will be zeroes.
So correcting this in the algorithm by using 24 as the right shift value
and compare the loc bit field with the current cache level.
Signed-off-by: sricharan <r.sricharan@ti.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
---
Boot tested this on omap4430sdp.
arch/arm/mm/cache-v7.S | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index 07c4bc8..868aa1f 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -45,7 +45,7 @@ ENTRY(v7_flush_dcache_all)
dmb @ ensure ordering with previous memory accesses
mrc p15, 1, r0, c0, c0, 1 @ read clidr
ands r3, r0, #0x7000000 @ extract loc from clidr
- mov r3, r3, lsr #23 @ left align loc bit field
+ mov r3, r3, lsr #24 @ left align loc bit field
beq finished @ if loc is 0, then no need to clean
mov r10, #0 @ start clean at cache level 0
loop1:
@@ -80,7 +80,7 @@ loop3:
bge loop2
skip:
add r10, r10, #2 @ increment cache number
- cmp r3, r10
+ cmp r3, r10, lsr #1
bgt loop1
finished:
mov r10, #0 @ swith back to cache level 0
--
1.7.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [RFC PATCH] ARM: Fix the bug in dcache flush all API.
2011-12-16 9:02 [RFC PATCH] ARM: Fix the bug in dcache flush all API sricharan
@ 2011-12-16 10:38 ` Will Deacon
[not found] ` <CAJ7qFScn07xH2t_PSH2jSPG+4NLed7sptWy=UV8Whv=hKgdLRA@mail.gmail.com>
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2011-12-16 10:38 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
On Fri, Dec 16, 2011 at 09:02:37AM +0000, sricharan wrote:
> Currently the v7_flush_dcache_all api uses the loc bit field
> in clidr register to find out the last level of cache that has to be
> flushed to achieve the data coherency. The loc bit field is present in
> the register from bits[24:26], but the algorithm uses bits[23:25] as
> the loc bit field. Bit[23] which corresponds to the cache type bit of
> level 8 is zero in most of the architectures. As a example, for a
> level 3 coherency the loc bit field is actually 2, since the bits[23:25]
> are used, the loc becomes 4 (multipled by 2) and algorithm compares this
> with current cace level * 2, which makes it work. But this wont work starting
> from cases where the loc bit field becomes 4, since bits[23:25] will be zeroes.
>
> So correcting this in the algorithm by using 24 as the right shift value
> and compare the loc bit field with the current cache level.
>
> Signed-off-by: sricharan <r.sricharan@ti.com>
> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> ---
> Boot tested this on omap4430sdp.
>
> arch/arm/mm/cache-v7.S | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index 07c4bc8..868aa1f 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -45,7 +45,7 @@ ENTRY(v7_flush_dcache_all)
> dmb @ ensure ordering with previous memory accesses
> mrc p15, 1, r0, c0, c0, 1 @ read clidr
> ands r3, r0, #0x7000000 @ extract loc from clidr
> - mov r3, r3, lsr #23 @ left align loc bit field
> + mov r3, r3, lsr #24 @ left align loc bit field
> beq finished @ if loc is 0, then no need to clean
> mov r10, #0 @ start clean at cache level 0
> loop1:
> @@ -80,7 +80,7 @@ loop3:
> bge loop2
> skip:
> add r10, r10, #2 @ increment cache number
> - cmp r3, r10
> + cmp r3, r10, lsr #1
> bgt loop1
> finished:
> mov r10, #0 @ swith back to cache level 0
I don't think this is fixing the problem fully and actually, I've just
spotted another bug where the line size is calculated incorrectly.
When we work out the shift to extract the cache type bits it will go wrong
once we hit the 4th iteration (r10 == 6) because we'll calculate a shift of
10 instead of 9.
I'll try and come up with some patches to fix this function properly since
it seems to broken in a few different ways.
Will
^ permalink raw reply [flat|nested] 6+ messages in thread
* [RFC PATCH] ARM: Fix the bug in dcache flush all API.
[not found] ` <CAJ7qFScn07xH2t_PSH2jSPG+4NLed7sptWy=UV8Whv=hKgdLRA@mail.gmail.com>
@ 2011-12-16 10:57 ` Will Deacon
2011-12-16 11:26 ` Will Deacon
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2011-12-16 10:57 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Dec 16, 2011 at 10:52:48AM +0000, R, Sricharan wrote:
> When we work out the shift to extract the cache type bits it will go wrong
> once we hit the 4th iteration (r10 == 6) because we'll calculate a shift of
> 10 instead of 9.
>
>
> add r2, r10, r10, lsr #1 @ work out 3x current cache
> level
> mov r1, r0, lsr r2 @ extract cache type bits from
> clidr
>
> Here when r10=6, then r2 will be 9 , which is correct shift ?
Yes, of course, I'm on my second coffee now. I still think the line size
calculation is plain wrong though (should be 1 << (ls + 2) instead of (ls +
4)).
I'll take another look@your patch.
Will
^ permalink raw reply [flat|nested] 6+ messages in thread
* [RFC PATCH] ARM: Fix the bug in dcache flush all API.
2011-12-16 10:57 ` Will Deacon
@ 2011-12-16 11:26 ` Will Deacon
2011-12-16 11:40 ` Will Deacon
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2011-12-16 11:26 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Dec 16, 2011 at 10:57:07AM +0000, Will Deacon wrote:
> I'll take another look at your patch.
So on a second look, I'm failing to see the problem you're describing. The
thing that I reckon needs changing is this:
diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index 07c4bc8..5ca3503 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -58,7 +58,7 @@ loop1:
isb @ isb to sych the new cssr&csidr
mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
and r2, r1, #7 @ extract the length of the cache lines
- add r2, r2, #4 @ add 4 (line length offset)
+ add r2, r2, #2 @ add 2 (line length offset)
ldr r4, =0x3ff
ands r4, r4, r1, lsr #3 @ find maximum number on the way size
clz r5, r4 @ find bit position of way size increment
but this function is crazily hard to read and is taken verbatim from the
ARM ARM, so I'll try and find the original author and ask them about it.
If you're convinced it's broken, please could you describe the problem in
more depth? (your original commit message says we use bits [25:23] for loc,
which isn't true - we use [26:24] but only shift it down to bit 1, which
we've masked to 0).
Will
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [RFC PATCH] ARM: Fix the bug in dcache flush all API.
2011-12-16 11:26 ` Will Deacon
@ 2011-12-16 11:40 ` Will Deacon
[not found] ` <CAJ7qFSdwKJYVVarbJBUgPYqmY-f4eWdGVmnhvA+fVinJ3Tn1yA@mail.gmail.com>
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2011-12-16 11:40 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Dec 16, 2011 at 11:26:40AM +0000, Will Deacon wrote:
> So on a second look, I'm failing to see the problem you're describing. The
> thing that I reckon needs changing is this:
>
>
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index 07c4bc8..5ca3503 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -58,7 +58,7 @@ loop1:
> isb @ isb to sych the new cssr&csidr
> mrc p15, 1, r1, c0, c0, 0 @ read the new csidr
> and r2, r1, #7 @ extract the length of the cache lines
> - add r2, r2, #4 @ add 4 (line length offset)
> + add r2, r2, #2 @ add 2 (line length offset)
> ldr r4, =0x3ff
> ands r4, r4, r1, lsr #3 @ find maximum number on the way size
> clz r5, r4 @ find bit position of way size increment
Actually, this is alright as well. We have to do a conversion from words to
bytes so we just add 2 to the power.
Maybe the only thing that needs changing is the comments :)
Will
^ permalink raw reply [flat|nested] 6+ messages in thread
* [RFC PATCH] ARM: Fix the bug in dcache flush all API.
[not found] ` <CAJ7qFSdwKJYVVarbJBUgPYqmY-f4eWdGVmnhvA+fVinJ3Tn1yA@mail.gmail.com>
@ 2011-12-16 11:50 ` Will Deacon
0 siblings, 0 replies; 6+ messages in thread
From: Will Deacon @ 2011-12-16 11:50 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Dec 16, 2011 at 11:47:25AM +0000, R, Sricharan wrote:
> On Fri, Dec 16, 2011 at 5:10 PM, Will Deacon <will.deacon@arm.com> wrote:
> Maybe the only thing that needs changing is the comments :)
>
>
> So is my patch ok ?
I fail to see why you need to change anything. Can you explain it more
clearly please (as I mentioned before, your previous commit message wasn't
correct)?
Will
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2011-12-16 11:50 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-16 9:02 [RFC PATCH] ARM: Fix the bug in dcache flush all API sricharan
2011-12-16 10:38 ` Will Deacon
[not found] ` <CAJ7qFScn07xH2t_PSH2jSPG+4NLed7sptWy=UV8Whv=hKgdLRA@mail.gmail.com>
2011-12-16 10:57 ` Will Deacon
2011-12-16 11:26 ` Will Deacon
2011-12-16 11:40 ` Will Deacon
[not found] ` <CAJ7qFSdwKJYVVarbJBUgPYqmY-f4eWdGVmnhvA+fVinJ3Tn1yA@mail.gmail.com>
2011-12-16 11:50 ` Will Deacon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).