* [RFC][PATCH v5] MPC5121 TLB errata workaround
@ 2009-03-16 15:52 David Jander
2009-03-16 16:09 ` David Jander
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: David Jander @ 2009-03-16 15:52 UTC (permalink / raw)
To: Kumar Gala; +Cc: linuxppc-dev, Paul Mackerras, Wolfgang Denk, gunnar
Complete workaround for DTLB errata in e300c2/c3/c4 processors.
Due to the bug, the hardware-implemented LRU algorythm always goes to way
1 of the TLB. This fix implements the proposed software workaround in
form of a LRW table for chosing the TLB-way.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David Jander <david@protonic.nl>
---
diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index 0f4fac5..3971ee4 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -578,9 +578,21 @@ DataLoadTLBMiss:
andc r1,r3,r1 /* PP = user? (rw&dirty? 2: 3): 0 */
mtspr SPRN_RPA,r1
mfspr r3,SPRN_DMISS
+ mfspr r2,SPRN_SRR1 /* Need to restore CR0 */
+ mtcrf 0x80,r2
+#ifdef CONFIG_PPC_MPC512x
+ li r0,1
+ mfspr r1,SPRN_SPRG6
+ rlwinm r2,r3,17,27,31 /* Get Address bits 19:15 */
+ slw r0,r0,r2
+ xor r1,r0,r1
+ srw r0,r1,r2
+ mtspr SPRN_SPRG6,r1
+ mfspr r2,SPRN_SRR1
+ rlwimi r2,r0,31-14,14,14
+ mtspr SPRN_SRR1,r2
+#endif
tlbld r3
- mfspr r3,SPRN_SRR1 /* Need to restore CR0 */
- mtcrf 0x80,r3
rfi
DataAddressInvalid:
mfspr r3,SPRN_SRR1
@@ -646,9 +658,21 @@ DataStoreTLBMiss:
andc r1,r3,r1 /* PP = user? 2: 0 */
mtspr SPRN_RPA,r1
mfspr r3,SPRN_DMISS
+ mfspr r2,SPRN_SRR1 /* Need to restore CR0 */
+ mtcrf 0x80,r2
+#ifdef CONFIG_PPC_MPC512x
+ li r0,1
+ mfspr r1,SPRN_SPRG6
+ rlwinm r2,r3,17,27,31 /* Get Address bits 19:15 */
+ slw r0,r0,r2
+ xor r1,r0,r1
+ srw r0,r1,r2
+ mtspr SPRN_SPRG6,r1
+ mfspr r2,SPRN_SRR1
+ rlwimi r2,r0,31-14,14,14
+ mtspr SPRN_SRR1,r2
+#endif
tlbld r3
- mfspr r3,SPRN_SRR1 /* Need to restore CR0 */
- mtcrf 0x80,r3
rfi
#ifndef CONFIG_ALTIVEC
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-03-16 15:52 [RFC][PATCH v5] MPC5121 TLB errata workaround David Jander
@ 2009-03-16 16:09 ` David Jander
2009-03-16 17:47 ` Kumar Gala
2009-03-16 16:34 ` Kenneth Johansson
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: David Jander @ 2009-03-16 16:09 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Paul Mackerras, gunnar, Wolfgang Denk
In this patch, I placed the LRW table in SPRG6 like before, but Kumar's code
seems a little more compact, so I decided to use that one and fix it ;-)
It's a pity we seem to have one register short in the handler, so we need to
load SPRN_SRR1 twice :-(
Allthough the code-path now has 1 instruction less than my previous version
most of the time (and 2 instructions more when way is not adjusted),
benchmark results are barely affected by this:
1.- mplayer: Total time: 50.392s (50.316s previous patch)
2.- prboom timedemo: 20.1 fps (19.9 fps previous patch)
3.- memcpy speed: 179 MByte/s (180 Mbyte/s previous patch)
Conclusion: difference not measurable between v4 and v5.
Best regards,
--
David Jander
Protonic Holland.
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-03-16 16:09 ` David Jander
@ 2009-03-16 17:47 ` Kumar Gala
0 siblings, 0 replies; 10+ messages in thread
From: Kumar Gala @ 2009-03-16 17:47 UTC (permalink / raw)
To: David Jander; +Cc: linuxppc-dev, Paul Mackerras, Wolfgang Denk, gunnar
On Mar 16, 2009, at 11:09 AM, David Jander wrote:
>
> In this patch, I placed the LRW table in SPRG6 like before, but
> Kumar's code
> seems a little more compact, so I decided to use that one and fix
> it ;-)
>
> It's a pity we seem to have one register short in the handler, so we
> need to
> load SPRN_SRR1 twice :-(
>
> Allthough the code-path now has 1 instruction less than my previous
> version
> most of the time (and 2 instructions more when way is not adjusted),
> benchmark results are barely affected by this:
>
> 1.- mplayer: Total time: 50.392s (50.316s previous patch)
>
> 2.- prboom timedemo: 20.1 fps (19.9 fps previous patch)
>
> 3.- memcpy speed: 179 MByte/s (180 Mbyte/s previous patch)
>
> Conclusion: difference not measurable between v4 and v5.
Can we try this memory instead of the SPRG to see if any noticeable
difference?
- k
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-03-16 15:52 [RFC][PATCH v5] MPC5121 TLB errata workaround David Jander
2009-03-16 16:09 ` David Jander
@ 2009-03-16 16:34 ` Kenneth Johansson
2009-03-16 18:05 ` Kumar Gala
2009-06-06 22:07 ` Wolfgang Denk
3 siblings, 0 replies; 10+ messages in thread
From: Kenneth Johansson @ 2009-03-16 16:34 UTC (permalink / raw)
To: David Jander; +Cc: linuxppc-dev, Paul Mackerras, gunnar, Wolfgang Denk
On Mon, 2009-03-16 at 16:52 +0100, David Jander wrote:
> Complete workaround for DTLB errata in e300c2/c3/c4 processors.
>
> Due to the bug, the hardware-implemented LRU algorythm always goes to way
> 1 of the TLB. This fix implements the proposed software workaround in
> form of a LRW table for chosing the TLB-way.
>
> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
> Signed-off-by: David Jander <david@protonic.nl>
I think we have a winner. with one instruction slot left :)
I tried your V4 and V5 and could not see any difference in speed.
Acked-by: Kenneth Johansson <kenneth@southpole.se>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-03-16 15:52 [RFC][PATCH v5] MPC5121 TLB errata workaround David Jander
2009-03-16 16:09 ` David Jander
2009-03-16 16:34 ` Kenneth Johansson
@ 2009-03-16 18:05 ` Kumar Gala
2009-03-17 10:38 ` David Jander
2009-06-06 22:07 ` Wolfgang Denk
3 siblings, 1 reply; 10+ messages in thread
From: Kumar Gala @ 2009-03-16 18:05 UTC (permalink / raw)
To: David Jander; +Cc: linuxppc-dev, Paul Mackerras, Wolfgang Denk, gunnar
On Mar 16, 2009, at 10:52 AM, David Jander wrote:
> Complete workaround for DTLB errata in e300c2/c3/c4 processors.
>
> Due to the bug, the hardware-implemented LRU algorythm always goes
> to way
> 1 of the TLB. This fix implements the proposed software workaround in
> form of a LRW table for chosing the TLB-way.
>
> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
> Signed-off-by: David Jander <david@protonic.nl>
>
> ---
> diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/
> head_32.S
> index 0f4fac5..3971ee4 100644
> --- a/arch/powerpc/kernel/head_32.S
> +++ b/arch/powerpc/kernel/head_32.S
> @@ -578,9 +578,21 @@ DataLoadTLBMiss:
> andc r1,r3,r1 /* PP = user? (rw&dirty? 2: 3): 0 */
> mtspr SPRN_RPA,r1
> mfspr r3,SPRN_DMISS
> + mfspr r2,SPRN_SRR1 /* Need to restore CR0 */
> + mtcrf 0x80,r2
> +#ifdef CONFIG_PPC_MPC512x
> + li r0,1
> + mfspr r1,SPRN_SPRG6
> + rlwinm r2,r3,17,27,31 /* Get Address bits 19:15 */
Don't we want:
rlwinm r2,r3,20,27,31 /* Get address bits 15:19 */
> + slw r0,r0,r2
> + xor r1,r0,r1
> + srw r0,r1,r2
> + mtspr SPRN_SPRG6,r1
> + mfspr r2,SPRN_SRR1
> + rlwimi r2,r0,31-14,14,14
> + mtspr SPRN_SRR1,r2
> +#endif
> tlbld r3
> - mfspr r3,SPRN_SRR1 /* Need to restore CR0 */
> - mtcrf 0x80,r3
> rfi
- k
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-03-16 18:05 ` Kumar Gala
@ 2009-03-17 10:38 ` David Jander
2009-03-17 12:43 ` Kumar Gala
0 siblings, 1 reply; 10+ messages in thread
From: David Jander @ 2009-03-17 10:38 UTC (permalink / raw)
To: Kumar Gala; +Cc: linuxppc-dev, Paul Mackerras, Wolfgang Denk, gunnar
On Monday 16 March 2009 19:05:00 Kumar Gala wrote:
> On Mar 16, 2009, at 10:52 AM, David Jander wrote:
> > Complete workaround for DTLB errata in e300c2/c3/c4 processors.
> >
> > Due to the bug, the hardware-implemented LRU algorythm always goes
> > to way
> > 1 of the TLB. This fix implements the proposed software workaround in
> > form of a LRW table for chosing the TLB-way.
> >
> > Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
> > Signed-off-by: David Jander <david@protonic.nl>
> >
> > ---
> > diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/
> > head_32.S
> > index 0f4fac5..3971ee4 100644
> > --- a/arch/powerpc/kernel/head_32.S
> > +++ b/arch/powerpc/kernel/head_32.S
> > @@ -578,9 +578,21 @@ DataLoadTLBMiss:
> > andc r1,r3,r1 /* PP = user? (rw&dirty? 2: 3): 0 */
> > mtspr SPRN_RPA,r1
> > mfspr r3,SPRN_DMISS
> > + mfspr r2,SPRN_SRR1 /* Need to restore CR0 */
> > + mtcrf 0x80,r2
> > +#ifdef CONFIG_PPC_MPC512x
> > + li r0,1
> > + mfspr r1,SPRN_SPRG6
> > + rlwinm r2,r3,17,27,31 /* Get Address bits 19:15 */
>
> Don't we want:
> rlwinm r2,r3,20,27,31 /* Get address bits 15:19 */
Hmm, you are right, I must have accidentally counted LE bit-ordering ;-)
Only funny part is, that I don't measure any performance difference after
fixing this.... *sigh*.
Regards,
--
David Jander
Protonic Holland.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-03-17 10:38 ` David Jander
@ 2009-03-17 12:43 ` Kumar Gala
0 siblings, 0 replies; 10+ messages in thread
From: Kumar Gala @ 2009-03-17 12:43 UTC (permalink / raw)
To: David Jander; +Cc: linuxppc-dev, Paul Mackerras, Wolfgang Denk, gunnar
On Mar 17, 2009, at 5:38 AM, David Jander wrote:
> On Monday 16 March 2009 19:05:00 Kumar Gala wrote:
>> On Mar 16, 2009, at 10:52 AM, David Jander wrote:
>>> Complete workaround for DTLB errata in e300c2/c3/c4 processors.
>>>
>>> Due to the bug, the hardware-implemented LRU algorythm always goes
>>> to way
>>> 1 of the TLB. This fix implements the proposed software workaround
>>> in
>>> form of a LRW table for chosing the TLB-way.
>>>
>>> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
>>> Signed-off-by: David Jander <david@protonic.nl>
>>>
>>> ---
>>> diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/
>>> head_32.S
>>> index 0f4fac5..3971ee4 100644
>>> --- a/arch/powerpc/kernel/head_32.S
>>> +++ b/arch/powerpc/kernel/head_32.S
>>> @@ -578,9 +578,21 @@ DataLoadTLBMiss:
>>> andc r1,r3,r1 /* PP = user? (rw&dirty? 2: 3): 0 */
>>> mtspr SPRN_RPA,r1
>>> mfspr r3,SPRN_DMISS
>>> + mfspr r2,SPRN_SRR1 /* Need to restore CR0 */
>>> + mtcrf 0x80,r2
>>> +#ifdef CONFIG_PPC_MPC512x
>>> + li r0,1
>>> + mfspr r1,SPRN_SPRG6
>>> + rlwinm r2,r3,17,27,31 /* Get Address bits 19:15 */
>>
>> Don't we want:
>> rlwinm r2,r3,20,27,31 /* Get address bits 15:19 */
>
> Hmm, you are right, I must have accidentally counted LE bit-
> ordering ;-)
> Only funny part is, that I don't measure any performance difference
> after
> fixing this.... *sigh*.
I wouldn't expect it to. You are picking some bits to hash on. You
picked some different bits but the distribution of addresses was
probably similiar.
- k
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-03-16 15:52 [RFC][PATCH v5] MPC5121 TLB errata workaround David Jander
` (2 preceding siblings ...)
2009-03-16 18:05 ` Kumar Gala
@ 2009-06-06 22:07 ` Wolfgang Denk
2009-06-06 22:42 ` Benjamin Herrenschmidt
3 siblings, 1 reply; 10+ messages in thread
From: Wolfgang Denk @ 2009-06-06 22:07 UTC (permalink / raw)
To: David Jander; +Cc: linuxppc-dev, Paul Mackerras, gunnar
Dear David Jander,
In message <200903161652.09747.david.jander@protonic.nl> you wrote:
> Complete workaround for DTLB errata in e300c2/c3/c4 processors.
>
> Due to the bug, the hardware-implemented LRU algorythm always goes to way
> 1 of the TLB. This fix implements the proposed software workaround in
> form of a LRW table for chosing the TLB-way.
>
> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
> Signed-off-by: David Jander <david@protonic.nl>
What is the actual status of this patch?
Patchwork (http://patchwork.ozlabs.org/patch/24502/) says it's
"superseded" - but by what?
I can't see such code in mainline - what happened to it?
Best regards,
Wolfgang Denk
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
One essential to success is that you desire be an all-obsessing one,
your thoughts and aims be co-ordinated, and your energy be concentra-
ted and applied without letup.
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-06-06 22:07 ` Wolfgang Denk
@ 2009-06-06 22:42 ` Benjamin Herrenschmidt
2009-06-08 18:16 ` Kumar Gala
0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2009-06-06 22:42 UTC (permalink / raw)
To: Wolfgang Denk; +Cc: linuxppc-dev, David Jander, gunnar, Paul Mackerras
On Sun, 2009-06-07 at 00:07 +0200, Wolfgang Denk wrote:
> Dear David Jander,
>
> In message <200903161652.09747.david.jander@protonic.nl> you wrote:
> > Complete workaround for DTLB errata in e300c2/c3/c4 processors.
> >
> > Due to the bug, the hardware-implemented LRU algorythm always goes to way
> > 1 of the TLB. This fix implements the proposed software workaround in
> > form of a LRW table for chosing the TLB-way.
> >
> > Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
> > Signed-off-by: David Jander <david@protonic.nl>
>
> What is the actual status of this patch?
>
> Patchwork (http://patchwork.ozlabs.org/patch/24502/) says it's
> "superseded" - but by what?
>
> I can't see such code in mainline - what happened to it?
I can see the code in mainline ... but only in the -data- TLB miss
handler, not the instruction one...
Kumar ? Shouldn't we have the workaround in both ?
Cheers,
Ben.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC][PATCH v5] MPC5121 TLB errata workaround
2009-06-06 22:42 ` Benjamin Herrenschmidt
@ 2009-06-08 18:16 ` Kumar Gala
0 siblings, 0 replies; 10+ messages in thread
From: Kumar Gala @ 2009-06-08 18:16 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev, Paul Mackerras, David Jander, Wolfgang Denk, gunnar
On Jun 6, 2009, at 5:42 PM, Benjamin Herrenschmidt wrote:
> On Sun, 2009-06-07 at 00:07 +0200, Wolfgang Denk wrote:
>> Dear David Jander,
>>
>> In message <200903161652.09747.david.jander@protonic.nl> you wrote:
>>> Complete workaround for DTLB errata in e300c2/c3/c4 processors.
>>>
>>> Due to the bug, the hardware-implemented LRU algorythm always goes
>>> to way
>>> 1 of the TLB. This fix implements the proposed software workaround
>>> in
>>> form of a LRW table for chosing the TLB-way.
>>>
>>> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
>>> Signed-off-by: David Jander <david@protonic.nl>
>>
>> What is the actual status of this patch?
>>
>> Patchwork (http://patchwork.ozlabs.org/patch/24502/) says it's
>> "superseded" - but by what?
>>
>> I can't see such code in mainline - what happened to it?
>
> I can see the code in mainline ... but only in the -data- TLB miss
> handler, not the instruction one...
>
> Kumar ? Shouldn't we have the workaround in both ?
The errata was only for the d-side. The patch is in the mainline
kernel.
- k
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2009-06-08 18:16 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-16 15:52 [RFC][PATCH v5] MPC5121 TLB errata workaround David Jander
2009-03-16 16:09 ` David Jander
2009-03-16 17:47 ` Kumar Gala
2009-03-16 16:34 ` Kenneth Johansson
2009-03-16 18:05 ` Kumar Gala
2009-03-17 10:38 ` David Jander
2009-03-17 12:43 ` Kumar Gala
2009-06-06 22:07 ` Wolfgang Denk
2009-06-06 22:42 ` Benjamin Herrenschmidt
2009-06-08 18:16 ` Kumar Gala
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).