* [PATCH V3] Skip unnecessary pte makeup when clearing it
@ 2012-01-30 8:36 bill4carson at gmail.com
2012-01-30 8:36 ` [PATCH] " bill4carson at gmail.com
0 siblings, 1 reply; 8+ messages in thread
From: bill4carson at gmail.com @ 2012-01-30 8:36 UTC (permalink / raw)
To: linux-arm-kernel
If we are only about to clear a hardware pte entry, then pte makeup code is
unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
Signed-off-by: Bill Carson <bill4carson@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: arm-mail-list <linux-arm-kernel@lists.infradead.org>
Changes for V3:
- Fix double tab issue pointed out by Uwe Kleine-König
Changes for V2:
- Use "1f" instead of "set_pte" as label
- Build/bootup test with thumb mode
- checkpatch script shows:
total: 0 errors, 0 warnings, 44 lines checked
./0001-Skip-unnecessary-pte-makeup-when-clearing-it.patch has no obvious style problems and is ready for submission.
arch/arm/mm/proc-macros.S | 10 +++++-----
arch/arm/mm/proc-v7-2level.S | 10 +++++-----
2 files changed, 10 insertions(+), 10 deletions(-)
* [PATCH] Skip unnecessary pte makeup when clearing it.
2012-01-30 8:36 [PATCH V3] Skip unnecessary pte makeup when clearing it bill4carson at gmail.com
@ 2012-01-30 8:36 ` bill4carson at gmail.com
2012-02-03 6:54 ` Uwe Kleine-König
2012-02-03 11:27 ` Catalin Marinas
0 siblings, 2 replies; 8+ messages in thread
From: bill4carson at gmail.com @ 2012-01-30 8:36 UTC (permalink / raw)
To: linux-arm-kernel
From: Bill Carson <bill4carson@gmail.com>
If we are only about to clear a hardware pte entry, then pte makeup code is
unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
Signed-off-by: Bill Carson <bill4carson@gmail.com>
---
arch/arm/mm/proc-macros.S | 10 +++++-----
arch/arm/mm/proc-v7-2level.S | 10 +++++-----
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 2d8ff3a..907b524 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -138,6 +138,10 @@
.macro armv6_set_pte_ext pfx
str r1, [r0], #2048 @ linux version
+ tst r1, #L_PTE_YOUNG
+ tstne r1, #L_PTE_PRESENT
+ moveq r3, #0
+ beq 1f
bic r3, r1, #0x000003fc
bic r3, r3, #PTE_TYPE_MASK
orr r3, r3, r2
@@ -163,11 +167,7 @@
orrne r3, r3, #PTE_EXT_XN
orr r3, r3, r2
-
- tst r1, #L_PTE_YOUNG
- tstne r1, #L_PTE_PRESENT
- moveq r3, #0
-
+1:
str r3, [r0]
mcr p15, 0, r0, c7, c10, 1 @ flush_pte
.endm
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index 3a4b3e7..0ff5338 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -76,6 +76,10 @@ ENTRY(cpu_v7_set_pte_ext)
#ifdef CONFIG_MMU
str r1, [r0] @ linux version
+ tst r1, #L_PTE_YOUNG
+ tstne r1, #L_PTE_PRESENT
+ moveq r3, #0
+ beq 1f
bic r3, r1, #0x000003f0
bic r3, r3, #PTE_TYPE_MASK
orr r3, r3, r2
@@ -98,11 +102,7 @@ ENTRY(cpu_v7_set_pte_ext)
tst r1, #L_PTE_XN
orrne r3, r3, #PTE_EXT_XN
-
- tst r1, #L_PTE_YOUNG
- tstne r1, #L_PTE_PRESENT
- moveq r3, #0
-
+1:
ARM( str r3, [r0, #2048]! )
THUMB( add r0, r0, #2048 )
THUMB( str r3, [r0] )
--
1.7.1
* [PATCH] Skip unnecessary pte makeup when clearing it.
2012-01-30 8:36 ` [PATCH] " bill4carson at gmail.com
@ 2012-02-03 6:54 ` Uwe Kleine-König
2012-02-03 7:43 ` bill4carson
2012-02-03 11:27 ` Catalin Marinas
1 sibling, 1 reply; 8+ messages in thread
From: Uwe Kleine-König @ 2012-02-03 6:54 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
> From: Bill Carson <bill4carson@gmail.com>
>
> If we are only about to clear a hardware pte entry, then pte makeup code is
> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>
> Signed-off-by: Bill Carson <bill4carson@gmail.com>
I haven't tested and I don't know if the patch brings any advantages like
increased speed. But AFAICT it doesn't change the behaviour of
armv6_set_pte_ext and cpu_v7_set_pte_ext.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
* [PATCH] Skip unnecessary pte makeup when clearing it.
2012-02-03 6:54 ` Uwe Kleine-König
@ 2012-02-03 7:43 ` bill4carson
2012-02-03 7:48 ` bill4carson
2012-02-03 9:35 ` Uwe Kleine-König
0 siblings, 2 replies; 8+ messages in thread
From: bill4carson @ 2012-02-03 7:43 UTC (permalink / raw)
To: linux-arm-kernel
On 2012-02-03 14:54, Uwe Kleine-König wrote:
> Hello,
>
> On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
>> From: Bill Carson<bill4carson@gmail.com>
>>
>> If we are only about to clear a hardware pte entry, then pte makeup code is
>> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>>
>> Signed-off-by: Bill Carson<bill4carson@gmail.com>
> I haven't tested and I don't know if the patch brings any advantages like
> increased speed. But AFAICT it doesn't change the behaviour of
> armv6_set_pte_ext and cpu_v7_set_pte_ext.
>
Hi, Uwe
I'm sorry I didn't state the purpose of this patch clearly.
As a matter of fact, it does change the behavior of set_pte_ext :)
Without this patch, the code paths are:
setting a pte: lines 147->173, 176->181
clearing a pte: lines 147->174, 176->181
The point is that lines 147->173 spend a lot of cpu cycles working out the right
value for r3. That value is only needed when setting a pte; when clearing a pte,
r3 is *ALWAYS* zero, so lines 147->173 don't need to be executed in that case.
145 ENTRY(cpu_v7_set_pte_ext)
146 #ifdef CONFIG_MMU
147 str r1, [r0] @ linux version
148
149 bic r3, r1, #0x000003f0
150 bic r3, r3, #PTE_TYPE_MASK
151 orr r3, r3, r2
152 orr r3, r3, #PTE_EXT_AP0 | 2
153
154 tst r1, #1 << 4
155 orrne r3, r3, #PTE_EXT_TEX(1)
156
157 eor r1, r1, #L_PTE_DIRTY
158 tst r1, #L_PTE_RDONLY | L_PTE_DIRTY
159 orrne r3, r3, #PTE_EXT_APX
160
161 tst r1, #L_PTE_USER
162 orrne r3, r3, #PTE_EXT_AP1
163 #ifdef CONFIG_CPU_USE_DOMAINS
164 @ allow kernel read/write access to read-only user pages
165 tstne r3, #PTE_EXT_APX
166 bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
167 #endif
168
169 tst r1, #L_PTE_XN
170 orrne r3, r3, #PTE_EXT_XN
171
172 tst r1, #L_PTE_YOUNG
173 tstne r1, #L_PTE_PRESENT
174 moveq r3, #0
175
176 ARM( str r3, [r0, #2048]! )
177 THUMB( add r0, r0, #2048 )
178 THUMB( str r3, [r0] )
179 mcr p15, 0, r0, c7, c10, 1 @ flush_pte
180 #endif
181 mov pc, lr
182 ENDPROC(cpu_v7_set_pte_ext)
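The three-instruction test at lines 172->174 of the listing above is what decides whether the hardware entry ends up zero. As a rough sketch of that flag logic in C (the bit positions are assumed for illustration, the kernel headers are authoritative, and the helper names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define L_PTE_PRESENT (1u << 0)
#define L_PTE_YOUNG   (1u << 1)

/*
 * Hypothetical helper modelling the sequence
 *     tst   r1, #L_PTE_YOUNG
 *     tstne r1, #L_PTE_PRESENT
 *     moveq r3, #0
 * Returns true when the sequence would leave r3 == 0.
 */
static bool hw_pte_is_cleared(uint32_t pte)
{
    /* tst: Z is set when the tested bit is clear. */
    bool z = (pte & L_PTE_YOUNG) == 0;

    /* tstne: only executes when Z is clear, i.e. YOUNG was set. */
    if (!z)
        z = (pte & L_PTE_PRESENT) == 0;

    /* moveq: fires when Z is set. */
    return z;
}

/* Small self-check covering the four YOUNG/PRESENT combinations. */
static void self_check(void)
{
    assert(hw_pte_is_cleared(0));
    assert(hw_pte_is_cleared(L_PTE_PRESENT));
    assert(hw_pte_is_cleared(L_PTE_YOUNG));
    assert(!hw_pte_is_cleared(L_PTE_PRESENT | L_PTE_YOUNG));
}
```

In other words, r3 is zeroed unless both L_PTE_YOUNG and L_PTE_PRESENT are set, and that outcome does not depend on anything computed in lines 149->170.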
With this patch, the code paths are:
setting a pte: lines 147->150, 153->181
clearing a pte: lines 147->152, 176->181
The code path for setting a pte barely changes, but the code path for
clearing a pte is significantly shorter than before, so there is a
performance gain to be had here.
145 ENTRY(cpu_v7_set_pte_ext)
146 #ifdef CONFIG_MMU
147 str r1, [r0] @ linux version
148
149 tst r1, #L_PTE_YOUNG
150 tstne r1, #L_PTE_PRESENT
151 moveq r3, #0
152 beq 1f
153 bic r3, r1, #0x000003f0
154 bic r3, r3, #PTE_TYPE_MASK
155 orr r3, r3, r2
156 orr r3, r3, #PTE_EXT_AP0 | 2
157
158 tst r1, #1 << 4
159 orrne r3, r3, #PTE_EXT_TEX(1)
160
161 eor r1, r1, #L_PTE_DIRTY
162 tst r1, #L_PTE_RDONLY | L_PTE_DIRTY
163 orrne r3, r3, #PTE_EXT_APX
164
165 tst r1, #L_PTE_USER
166 orrne r3, r3, #PTE_EXT_AP1
167 #ifdef CONFIG_CPU_USE_DOMAINS
168 @ allow kernel read/write access to read-only user pages
169 tstne r3, #PTE_EXT_APX
170 bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
171 #endif
172
173 tst r1, #L_PTE_XN
174 orrne r3, r3, #PTE_EXT_XN
175
176 1:
177 ARM( str r3, [r0, #2048]! )
178 THUMB( add r0, r0, #2048 )
179 THUMB( str r3, [r0] )
180 mcr p15, 0, r0, c7, c10, 1 @ flush_pte
181 #endif
182 mov pc, lr
183 ENDPROC(cpu_v7_set_pte_ext)
I hope the above explanation justifies this patch.
Regards
Bill
> Best regards
> Uwe
>
--
I am a slow learner
but I will keep trying to fight for my dreams!
--bill
* [PATCH] Skip unnecessary pte makeup when clearing it.
2012-02-03 7:43 ` bill4carson
@ 2012-02-03 7:48 ` bill4carson
2012-02-03 9:35 ` Uwe Kleine-König
1 sibling, 0 replies; 8+ messages in thread
From: bill4carson @ 2012-02-03 7:48 UTC (permalink / raw)
To: linux-arm-kernel
On 2012-02-03 15:43, bill4carson wrote:
>
>
> On 2012-02-03 14:54, Uwe Kleine-König wrote:
>> Hello,
>>
>> On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
>>> From: Bill Carson<bill4carson@gmail.com>
>>>
>>> If we are only about to clear a hardware pte entry, then pte makeup
>>> code is
>>> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just
>>> skip it.
>>>
>>> Signed-off-by: Bill Carson<bill4carson@gmail.com>
>> I haven't tested and I don't know if the patch brings any advantages like
>> increased speed. But AFAICT it doesn't change the behaviour of
>> armv6_set_pte_ext and cpu_v7_set_pte_ext.
>>
> Hi, Uwe
>
> I'm sorry I didn't state the purpose of this patch clearly.
> As a matter of fact, it does change the behavior of set_pte_ext :)
>
> Without this patch, the code path when:
> set a pte: line 147->173, 176->181
> clear a pte: line 147->174, 176->181
>
> Point is line 147->173 takes a lot of cpu cycles to figure out the
> right r3,
> This is only used when set a pte, when clearing a pte, r3 *ALWAYS* has
> zero
> value! which means line 147->173 doesn't need to be executed in such
> case.
>
To be precise: lines 149->170.
> 145ENTRY(cpu_v7_set_pte_ext)
> 146#ifdef CONFIG_MMU
> 147 str r1, [r0] @ linux version
> 148
> 149 bic r3, r1, #0x000003f0
> 150 bic r3, r3, #PTE_TYPE_MASK
> 151 orr r3, r3, r2
> 152 orr r3, r3, #PTE_EXT_AP0 | 2
> 153
> 154 tst r1, #1 << 4
> 155 orrne r3, r3, #PTE_EXT_TEX(1)
> 156
> 157 eor r1, r1, #L_PTE_DIRTY
> 158 tst r1, #L_PTE_RDONLY | L_PTE_DIRTY
> 159 orrne r3, r3, #PTE_EXT_APX
> 160
> 161 tst r1, #L_PTE_USER
> 162 orrne r3, r3, #PTE_EXT_AP1
> 163#ifdef CONFIG_CPU_USE_DOMAINS
> 164 @ allow kernel read/write access to read-only user pages
> 165 tstne r3, #PTE_EXT_APX
> 166 bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
> 167#endif
> 168
> 169 tst r1, #L_PTE_XN
> 170 orrne r3, r3, #PTE_EXT_XN
> 171
> 172 tst r1, #L_PTE_YOUNG
> 173 tstne r1, #L_PTE_PRESENT
> 174 moveq r3, #0
> 175
> 176 ARM( str r3, [r0, #2048]! )
> 177 THUMB( add r0, r0, #2048 )
> 178 THUMB( str r3, [r0] )
> 179 mcr p15, 0, r0, c7, c10, 1 @ flush_pte
> 180#endif
> 181 mov pc, lr
> 182ENDPROC(cpu_v7_set_pte_ext)
>
>
>
> With this patch, the code path when:
> set a pte: line 147->150, 153->181
> clear a pte: line 147->152, 176->181
>
> The code path when setting a pte does not change much at all.
> But code path of clearing a pte is significantly shorter than before,
> and performance enhancement is handy here.
>
> 145 ENTRY(cpu_v7_set_pte_ext)
> 146 #ifdef CONFIG_MMU
> 147 str r1, [r0] @ linux version
> 148
> 149 tst r1, #L_PTE_YOUNG
> 150 tstne r1, #L_PTE_PRESENT
> 151 moveq r3, #0
> 152 beq 1f
> 153 bic r3, r1, #0x000003f0
> 154 bic r3, r3, #PTE_TYPE_MASK
> 155 orr r3, r3, r2
> 156 orr r3, r3, #PTE_EXT_AP0 | 2
> 157
> 158 tst r1, #1 << 4
> 159 orrne r3, r3, #PTE_EXT_TEX(1)
> 160
> 161 eor r1, r1, #L_PTE_DIRTY
> 162 tst r1, #L_PTE_RDONLY | L_PTE_DIRTY
> 163 orrne r3, r3, #PTE_EXT_APX
> 164
> 165 tst r1, #L_PTE_USER
> 166 orrne r3, r3, #PTE_EXT_AP1
> 167 #ifdef CONFIG_CPU_USE_DOMAINS
> 168 @ allow kernel read/write access to read-only user pages
> 169 tstne r3, #PTE_EXT_APX
> 170 bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
> 171 #endif
> 172
> 173 tst r1, #L_PTE_XN
> 174 orrne r3, r3, #PTE_EXT_XN
> 175
> 176 1:
> 177 ARM( str r3, [r0, #2048]! )
> 178 THUMB( add r0, r0, #2048 )
> 179 THUMB( str r3, [r0] )
> 180 mcr p15, 0, r0, c7, c10, 1 @ flush_pte
> 181 #endif
> 182 mov pc, lr
> ENDPROC(cpu_v7_set_pte_ext)
>
>
> I hope the above explanation could justify this patch.
>
>
>
> Regards
> Bill
>
>
>
>> Best regards
>> Uwe
>>
>
--
I am a slow learner
but I will keep trying to fight for my dreams!
--bill
* [PATCH] Skip unnecessary pte makeup when clearing it.
2012-02-03 7:43 ` bill4carson
2012-02-03 7:48 ` bill4carson
@ 2012-02-03 9:35 ` Uwe Kleine-König
2012-02-03 10:09 ` bill4carson
1 sibling, 1 reply; 8+ messages in thread
From: Uwe Kleine-König @ 2012-02-03 9:35 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Feb 03, 2012 at 03:43:58PM +0800, bill4carson wrote:
>
>
> On 2012-02-03 14:54, Uwe Kleine-König wrote:
> >Hello,
> >
> >On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
> >>From: Bill Carson<bill4carson@gmail.com>
> >>
> >>If we are only about to clear a hardware pte entry, then pte makeup code is
> >>unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
> >>
> >>Signed-off-by: Bill Carson<bill4carson@gmail.com>
> >I haven't tested and I don't know if the patch brings any advantages like
> >increased speed. But AFAICT it doesn't change the behaviour of
> >armv6_set_pte_ext and cpu_v7_set_pte_ext.
> >
> Hi, Uwe
>
> I'm sorry I didn't state the purpose of this patch clearly.
> As a matter of fact, it does change the behavior of set_pte_ext :)
Depends on what you call behaviour (and it's not the 'u' you dropped
that makes a difference :-). I meant that the side effects don't change.
It's only that they are accomplished in a different (probably more
effective) way.
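To illustrate the point, here is a minimal C model of the two orderings with an exhaustive comparison of the value that ends up in the hardware pte. It is only a sketch: the bit values are assumed for illustration (the kernel headers are authoritative), CONFIG_CPU_USE_DOMAINS is assumed to be off, the two sample extension values are arbitrary, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values, for illustration; see the kernel headers. */
#define L_PTE_PRESENT  (1u << 0)
#define L_PTE_YOUNG    (1u << 1)
#define L_PTE_DIRTY    (1u << 6)
#define L_PTE_RDONLY   (1u << 7)
#define L_PTE_USER     (1u << 8)
#define L_PTE_XN       (1u << 9)
#define PTE_TYPE_MASK  (3u << 0)
#define PTE_EXT_XN     (1u << 0)
#define PTE_EXT_AP0    (1u << 4)
#define PTE_EXT_AP1    (2u << 4)
#define PTE_EXT_TEX(x) ((uint32_t)(x) << 6)
#define PTE_EXT_APX    (1u << 9)

/* The "makeup" step: build the hardware pte (r3) from the Linux pte (r1)
 * and the extension bits (r2). CONFIG_CPU_USE_DOMAINS is assumed off. */
static uint32_t makeup(uint32_t pte, uint32_t ext)
{
    uint32_t hw = pte & ~0x000003f0u;
    hw &= ~PTE_TYPE_MASK;
    hw |= ext;
    hw |= PTE_EXT_AP0 | 2u;
    if (pte & (1u << 4))
        hw |= PTE_EXT_TEX(1);
    pte ^= L_PTE_DIRTY;                 /* only the DIRTY bit is toggled */
    if (pte & (L_PTE_RDONLY | L_PTE_DIRTY))
        hw |= PTE_EXT_APX;
    if (pte & L_PTE_USER)
        hw |= PTE_EXT_AP1;
    if (pte & L_PTE_XN)
        hw |= PTE_EXT_XN;
    return hw;
}

/* tst / tstne / moveq: zero the entry unless YOUNG and PRESENT are both set. */
static int must_clear(uint32_t pte)
{
    return !(pte & L_PTE_YOUNG) || !(pte & L_PTE_PRESENT);
}

/* Before the patch: always do the makeup, then possibly discard it. */
static uint32_t hw_pte_before(uint32_t pte, uint32_t ext)
{
    uint32_t hw = makeup(pte, ext);
    return must_clear(pte) ? 0u : hw;
}

/* After the patch: test first and skip the makeup when clearing. */
static uint32_t hw_pte_after(uint32_t pte, uint32_t ext)
{
    if (must_clear(pte))
        return 0u;
    return makeup(pte, ext);
}

/* Compare both orderings over every combination of the low 12 pte bits. */
static unsigned count_mismatches(void)
{
    const uint32_t ext_values[] = { 0u, 1u << 11 };  /* arbitrary samples */
    unsigned mismatches = 0;

    for (uint32_t pte = 0; pte < (1u << 12); pte++)
        for (unsigned i = 0; i < 2; i++)
            if (hw_pte_before(pte, ext_values[i]) !=
                hw_pte_after(pte, ext_values[i]))
                mismatches++;
    return mismatches;
}

static void self_check(void)
{
    assert(count_mismatches() == 0);
}
```

The model says nothing about cycle counts; it only shows that moving the test ahead of the makeup leaves the stored value unchanged.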
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
* [PATCH] Skip unnecessary pte makeup when clearing it.
2012-02-03 9:35 ` Uwe Kleine-König
@ 2012-02-03 10:09 ` bill4carson
0 siblings, 0 replies; 8+ messages in thread
From: bill4carson @ 2012-02-03 10:09 UTC (permalink / raw)
To: linux-arm-kernel
On 2012-02-03 17:35, Uwe Kleine-König wrote:
> On Fri, Feb 03, 2012 at 03:43:58PM +0800, bill4carson wrote:
>>
>>
>> On 2012-02-03 14:54, Uwe Kleine-König wrote:
>>> Hello,
>>>
>>> On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
>>>> From: Bill Carson<bill4carson@gmail.com>
>>>>
>>>> If we are only about to clear a hardware pte entry, then pte makeup code is
>>>> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>>>>
>>>> Signed-off-by: Bill Carson<bill4carson@gmail.com>
>>> I haven't tested and I don't know if the patch brings any advantages like
>>> increased speed. But AFAICT it doesn't change the behaviour of
>>> armv6_set_pte_ext and cpu_v7_set_pte_ext.
>>>
>> Hi, Uwe
>>
>> I'm sorry I didn't state the purpose of this patch clearly.
>> As a matter of fact, it does change the behavior of set_pte_ext :)
> Depends on what you call behaviour (and it's not the 'u' you dropped
> that makes a difference :-). I meant that the side effects don't change.
> It's only that they are accomplished in a different (probably more
> effective) way.
>
Thanks for your explanation, I see what you mean now :)
Yes, seen from the outside, set_pte_ext provides exactly the same function as
before; internally, it should run faster than before when clearing a pte,
thanks to this small modification. I see no reason not to do it.
Or am I missing something here?
> Best regards
> Uwe
>
--
I am a slow learner
but I will keep trying to fight for my dreams!
--bill
* [PATCH] Skip unnecessary pte makeup when clearing it.
2012-01-30 8:36 ` [PATCH] " bill4carson at gmail.com
2012-02-03 6:54 ` Uwe Kleine-König
@ 2012-02-03 11:27 ` Catalin Marinas
1 sibling, 0 replies; 8+ messages in thread
From: Catalin Marinas @ 2012-02-03 11:27 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 30, 2012 at 08:36:07AM +0000, bill4carson at gmail.com wrote:
> From: Bill Carson <bill4carson@gmail.com>
>
> If we are only about to clear a hardware pte entry, then pte makeup code is
> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>
> Signed-off-by: Bill Carson <bill4carson@gmail.com>
I'll give a similar answer to Uwe: the patch looks fine, but I haven't
tested it or run any benchmarks to tell whether it's worth it.
Anyway,
Acked-by: Catalin Marinas <catalin.marinas@arm.com>