* [PATCH V3] Skip unnecessary pte makeup when clearing it
From: bill4carson at gmail.com @ 2012-01-30 8:36 UTC
To: linux-arm-kernel

If we are only about to clear a hardware pte entry, then pte makeup code is
unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.

Signed-off-by: Bill Carson <bill4carson@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: arm-mail-list <linux-arm-kernel@lists.infradead.org>

Changes for V3:
- Fix double tab issue pointed out by Uwe Kleine-König

Changes for V2:
- Use "1f" instead of "set_pte" as label
- Build/bootup test with thumb mode
- checkpatch script shows:
  total: 0 errors, 0 warnings, 44 lines checked
  ./0001-Skip-unnecessary-pte-makeup-when-clearing-it.patch has no obvious
  style problems and is ready for submission.

 arch/arm/mm/proc-macros.S    |   10 +++++-----
 arch/arm/mm/proc-v7-2level.S |   10 +++++-----
 2 files changed, 10 insertions(+), 10 deletions(-)

* [PATCH] Skip unnecessary pte makeup when clearing it.
From: bill4carson at gmail.com @ 2012-01-30 8:36 UTC
To: linux-arm-kernel

From: Bill Carson <bill4carson@gmail.com>

If we are only about to clear a hardware pte entry, then pte makeup code is
unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.

Signed-off-by: Bill Carson <bill4carson@gmail.com>
---
 arch/arm/mm/proc-macros.S    |   10 +++++-----
 arch/arm/mm/proc-v7-2level.S |   10 +++++-----
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 2d8ff3a..907b524 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -138,6 +138,10 @@
 	.macro	armv6_set_pte_ext pfx
 	str	r1, [r0], #2048			@ linux version
 
+	tst	r1, #L_PTE_YOUNG
+	tstne	r1, #L_PTE_PRESENT
+	moveq	r3, #0
+	beq	1f
 	bic	r3, r1, #0x000003fc
 	bic	r3, r3, #PTE_TYPE_MASK
 	orr	r3, r3, r2
@@ -163,11 +167,7 @@
 	orrne	r3, r3, #PTE_EXT_XN
 
 	orr	r3, r3, r2
-
-	tst	r1, #L_PTE_YOUNG
-	tstne	r1, #L_PTE_PRESENT
-	moveq	r3, #0
-
+1:
 	str	r3, [r0]
 	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
 	.endm
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index 3a4b3e7..0ff5338 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -76,6 +76,10 @@ ENTRY(cpu_v7_set_pte_ext)
 #ifdef CONFIG_MMU
 	str	r1, [r0]			@ linux version
 
+	tst	r1, #L_PTE_YOUNG
+	tstne	r1, #L_PTE_PRESENT
+	moveq	r3, #0
+	beq	1f
 	bic	r3, r1, #0x000003f0
 	bic	r3, r3, #PTE_TYPE_MASK
 	orr	r3, r3, r2
@@ -98,11 +102,7 @@ ENTRY(cpu_v7_set_pte_ext)
 
 	tst	r1, #L_PTE_XN
 	orrne	r3, r3, #PTE_EXT_XN
-
-	tst	r1, #L_PTE_YOUNG
-	tstne	r1, #L_PTE_PRESENT
-	moveq	r3, #0
-
+1:
  ARM(	str	r3, [r0, #2048]! )
  THUMB(	add	r0, r0, #2048 )
  THUMB(	str	r3, [r0] )
-- 
1.7.1

* [PATCH] Skip unnecessary pte makeup when clearing it.
From: Uwe Kleine-König @ 2012-02-03 6:54 UTC
To: linux-arm-kernel

Hello,

On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
> From: Bill Carson <bill4carson@gmail.com>
>
> If we are only about to clear a hardware pte entry, then pte makeup code is
> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>
> Signed-off-by: Bill Carson <bill4carson@gmail.com>
I haven't tested and I don't know if the patch brings any advantages like
increased speed. But AFAICT it doesn't change the behaviour of
armv6_set_pte_ext and cpu_v7_set_pte_ext.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

* [PATCH] Skip unnecessary pte makeup when clearing it.
From: bill4carson @ 2012-02-03 7:43 UTC
To: linux-arm-kernel

On 2012-02-03 14:54, Uwe Kleine-König wrote:
> Hello,
>
> On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
>> From: Bill Carson<bill4carson@gmail.com>
>>
>> If we are only about to clear a hardware pte entry, then pte makeup code is
>> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>>
>> Signed-off-by: Bill Carson<bill4carson@gmail.com>
> I haven't tested and I don't know if the patch brings any advantages like
> increased speed. But AFAICT it doesn't change the behaviour of
> armv6_set_pte_ext and cpu_v7_set_pte_ext.
>
Hi, Uwe

I'm sorry I didn't state the purpose of this patch clearly.
As a matter of fact, it does change the behavior of set_pte_ext :)

Without this patch, the code paths are:
set a pte:   line 147->173, 176->181
clear a pte: line 147->174, 176->181

The point is that lines 147->173 take a lot of CPU cycles to figure out the
right r3. That value is only used when setting a pte; when clearing a pte,
r3 *ALWAYS* ends up zero, which means lines 147->173 don't need to be
executed in that case.

145 ENTRY(cpu_v7_set_pte_ext)
146 #ifdef CONFIG_MMU
147 	str	r1, [r0]			@ linux version
148
149 	bic	r3, r1, #0x000003f0
150 	bic	r3, r3, #PTE_TYPE_MASK
151 	orr	r3, r3, r2
152 	orr	r3, r3, #PTE_EXT_AP0 | 2
153
154 	tst	r1, #1 << 4
155 	orrne	r3, r3, #PTE_EXT_TEX(1)
156
157 	eor	r1, r1, #L_PTE_DIRTY
158 	tst	r1, #L_PTE_RDONLY | L_PTE_DIRTY
159 	orrne	r3, r3, #PTE_EXT_APX
160
161 	tst	r1, #L_PTE_USER
162 	orrne	r3, r3, #PTE_EXT_AP1
163 #ifdef CONFIG_CPU_USE_DOMAINS
164 	@ allow kernel read/write access to read-only user pages
165 	tstne	r3, #PTE_EXT_APX
166 	bicne	r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
167 #endif
168
169 	tst	r1, #L_PTE_XN
170 	orrne	r3, r3, #PTE_EXT_XN
171
172 	tst	r1, #L_PTE_YOUNG
173 	tstne	r1, #L_PTE_PRESENT
174 	moveq	r3, #0
175
176  ARM(	str	r3, [r0, #2048]! )
177  THUMB(	add	r0, r0, #2048 )
178  THUMB(	str	r3, [r0] )
179 	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
180 #endif
181 	mov	pc, lr
182 ENDPROC(cpu_v7_set_pte_ext)

With this patch, the code paths are:
set a pte:   line 147->150, 153->181
clear a pte: line 147->152, 176->181

The code path when setting a pte hardly changes at all, but the code path
when clearing a pte is significantly shorter than before, so the
performance improvement comes easily here.

145 ENTRY(cpu_v7_set_pte_ext)
146 #ifdef CONFIG_MMU
147 	str	r1, [r0]			@ linux version
148
149 	tst	r1, #L_PTE_YOUNG
150 	tstne	r1, #L_PTE_PRESENT
151 	moveq	r3, #0
152 	beq	1f
153 	bic	r3, r1, #0x000003f0
154 	bic	r3, r3, #PTE_TYPE_MASK
155 	orr	r3, r3, r2
156 	orr	r3, r3, #PTE_EXT_AP0 | 2
157
158 	tst	r1, #1 << 4
159 	orrne	r3, r3, #PTE_EXT_TEX(1)
160
161 	eor	r1, r1, #L_PTE_DIRTY
162 	tst	r1, #L_PTE_RDONLY | L_PTE_DIRTY
163 	orrne	r3, r3, #PTE_EXT_APX
164
165 	tst	r1, #L_PTE_USER
166 	orrne	r3, r3, #PTE_EXT_AP1
167 #ifdef CONFIG_CPU_USE_DOMAINS
168 	@ allow kernel read/write access to read-only user pages
169 	tstne	r3, #PTE_EXT_APX
170 	bicne	r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
171 #endif
172
173 	tst	r1, #L_PTE_XN
174 	orrne	r3, r3, #PTE_EXT_XN
175
176 1:
177  ARM(	str	r3, [r0, #2048]! )
178  THUMB(	add	r0, r0, #2048 )
179  THUMB(	str	r3, [r0] )
180 	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
181 #endif
182 	mov	pc, lr
ENDPROC(cpu_v7_set_pte_ext)

I hope the above explanation justifies this patch.

Regards
Bill

> Best regards
> Uwe
>

-- 
I am a slow learner
but I will keep trying to fight for my dreams!

--bill

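The path comparison in the message above can be made concrete by counting the instructions issued on each path. What follows is only a rough sketch, not part of the patch: it assumes an ARM (not Thumb) build with CONFIG_CPU_USE_DOMAINS enabled, and it counts every fetched instruction as one, including conditional instructions whose condition fails. The counts are read off the two listings; they are not cycle measurements, and the names PROLOGUE, CLEAR_CHECK, MAKEUP, EPILOGUE and issued() are made up for this illustration.

```python
# Instructions of cpu_v7_set_pte_ext from the two listings, grouped by role.
# Assumptions (not from the thread): ARM build, CONFIG_CPU_USE_DOMAINS=y,
# and each fetched instruction counts as 1 whatever its condition code.
PROLOGUE = ["str r1, [r0]"]                  # store the Linux PTE
CLEAR_CHECK = ["tst", "tstne", "moveq"]      # r3 = 0 unless YOUNG and PRESENT
MAKEUP = [
    "bic", "bic", "orr", "orr",              # base bits, type, ext bits, AP0
    "tst", "orrne",                          # TEX
    "eor", "tst", "orrne",                   # APX
    "tst", "orrne",                          # AP1
    "tstne", "bicne",                        # CONFIG_CPU_USE_DOMAINS fixup
    "tst", "orrne",                          # XN
]
EPILOGUE = ["str r3, [r0, #2048]!", "mcr", "mov pc, lr"]


def issued(before_patch: bool, clearing: bool) -> int:
    """Number of instructions issued on one call, under the assumptions above."""
    if before_patch:
        # Old code: always build r3, then possibly overwrite it with zero.
        body = MAKEUP + CLEAR_CHECK
    elif clearing:
        # New code, clearing: "beq 1f" is taken and the makeup is skipped.
        body = CLEAR_CHECK + ["beq"]
    else:
        # New code, setting: the branch is issued but not taken.
        body = CLEAR_CHECK + ["beq"] + MAKEUP
    return len(PROLOGUE) + len(body) + len(EPILOGUE)


if __name__ == "__main__":
    for before in (True, False):
        for clearing in (True, False):
            print(before, clearing, issued(before, clearing))
```

Under these assumptions the clear path drops from 22 to 8 issued instructions, while the set path grows from 22 to 23 because of the extra untaken branch. Whether that is measurable on real hardware is a separate question that the thread leaves open.
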
* [PATCH] Skip unnecessary pte makeup when clearing it.
From: bill4carson @ 2012-02-03 7:48 UTC
To: linux-arm-kernel

On 2012-02-03 15:43, bill4carson wrote:
>
>
> On 2012-02-03 14:54, Uwe Kleine-König wrote:
>> Hello,
>>
>> On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
>>> From: Bill Carson<bill4carson@gmail.com>
>>>
>>> If we are only about to clear a hardware pte entry, then pte makeup
>>> code is
>>> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just
>>> skip it.
>>>
>>> Signed-off-by: Bill Carson<bill4carson@gmail.com>
>> I haven't tested and I don't know if the patch brings any advantages like
>> increased speed. But AFAICT it doesn't change the behaviour of
>> armv6_set_pte_ext and cpu_v7_set_pte_ext.
>>
> Hi, Uwe
>
> I'm sorry I didn't state the purpose of this patch clearly.
> As a matter of fact, it does change the behavior of set_pte_ext :)
>
> Without this patch, the code paths are:
> set a pte:   line 147->173, 176->181
> clear a pte: line 147->174, 176->181
>
> The point is that lines 147->173 take a lot of CPU cycles to figure out
> the right r3. That value is only used when setting a pte; when clearing
> a pte, r3 *ALWAYS* ends up zero, which means lines 147->173 don't need
> to be executed in that case.
>
To be precise: 149->170

> 145 ENTRY(cpu_v7_set_pte_ext)
> 146 #ifdef CONFIG_MMU
> 147 	str	r1, [r0]			@ linux version
> 148
> 149 	bic	r3, r1, #0x000003f0
> 150 	bic	r3, r3, #PTE_TYPE_MASK
> 151 	orr	r3, r3, r2
> 152 	orr	r3, r3, #PTE_EXT_AP0 | 2
> 153
> 154 	tst	r1, #1 << 4
> 155 	orrne	r3, r3, #PTE_EXT_TEX(1)
> 156
> 157 	eor	r1, r1, #L_PTE_DIRTY
> 158 	tst	r1, #L_PTE_RDONLY | L_PTE_DIRTY
> 159 	orrne	r3, r3, #PTE_EXT_APX
> 160
> 161 	tst	r1, #L_PTE_USER
> 162 	orrne	r3, r3, #PTE_EXT_AP1
> 163 #ifdef CONFIG_CPU_USE_DOMAINS
> 164 	@ allow kernel read/write access to read-only user pages
> 165 	tstne	r3, #PTE_EXT_APX
> 166 	bicne	r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
> 167 #endif
> 168
> 169 	tst	r1, #L_PTE_XN
> 170 	orrne	r3, r3, #PTE_EXT_XN
> 171
> 172 	tst	r1, #L_PTE_YOUNG
> 173 	tstne	r1, #L_PTE_PRESENT
> 174 	moveq	r3, #0
> 175
> 176  ARM(	str	r3, [r0, #2048]! )
> 177  THUMB(	add	r0, r0, #2048 )
> 178  THUMB(	str	r3, [r0] )
> 179 	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
> 180 #endif
> 181 	mov	pc, lr
> 182 ENDPROC(cpu_v7_set_pte_ext)
>
>
>
> With this patch, the code paths are:
> set a pte:   line 147->150, 153->181
> clear a pte: line 147->152, 176->181
>
> The code path when setting a pte hardly changes at all, but the code
> path when clearing a pte is significantly shorter than before, so the
> performance improvement comes easily here.
>
> 145 ENTRY(cpu_v7_set_pte_ext)
> 146 #ifdef CONFIG_MMU
> 147 	str	r1, [r0]			@ linux version
> 148
> 149 	tst	r1, #L_PTE_YOUNG
> 150 	tstne	r1, #L_PTE_PRESENT
> 151 	moveq	r3, #0
> 152 	beq	1f
> 153 	bic	r3, r1, #0x000003f0
> 154 	bic	r3, r3, #PTE_TYPE_MASK
> 155 	orr	r3, r3, r2
> 156 	orr	r3, r3, #PTE_EXT_AP0 | 2
> 157
> 158 	tst	r1, #1 << 4
> 159 	orrne	r3, r3, #PTE_EXT_TEX(1)
> 160
> 161 	eor	r1, r1, #L_PTE_DIRTY
> 162 	tst	r1, #L_PTE_RDONLY | L_PTE_DIRTY
> 163 	orrne	r3, r3, #PTE_EXT_APX
> 164
> 165 	tst	r1, #L_PTE_USER
> 166 	orrne	r3, r3, #PTE_EXT_AP1
> 167 #ifdef CONFIG_CPU_USE_DOMAINS
> 168 	@ allow kernel read/write access to read-only user pages
> 169 	tstne	r3, #PTE_EXT_APX
> 170 	bicne	r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
> 171 #endif
> 172
> 173 	tst	r1, #L_PTE_XN
> 174 	orrne	r3, r3, #PTE_EXT_XN
> 175
> 176 1:
> 177  ARM(	str	r3, [r0, #2048]! )
> 178  THUMB(	add	r0, r0, #2048 )
> 179  THUMB(	str	r3, [r0] )
> 180 	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
> 181 #endif
> 182 	mov	pc, lr
> ENDPROC(cpu_v7_set_pte_ext)
>
>
> I hope the above explanation justifies this patch.
>
>
>
> Regards
> Bill
>
>
>
>> Best regards
>> Uwe
>>
>

-- 
I am a slow learner
but I will keep trying to fight for my dreams!

--bill

* [PATCH] Skip unnecessary pte makeup when clearing it.
From: Uwe Kleine-König @ 2012-02-03 9:35 UTC
To: linux-arm-kernel

On Fri, Feb 03, 2012 at 03:43:58PM +0800, bill4carson wrote:
> On 2012-02-03 14:54, Uwe Kleine-König wrote:
> >Hello,
> >
> >On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
> >>From: Bill Carson<bill4carson@gmail.com>
> >>
> >>If we are only about to clear a hardware pte entry, then pte makeup code is
> >>unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
> >>
> >>Signed-off-by: Bill Carson<bill4carson@gmail.com>
> >I havn't tested and I don't know if the patch brings any advantages like
> >increased speed. But AFAICT it doesn't change the behaviour of
> >armv6_set_pte_ext and cpu_v7_set_pte_ext.
> >
> Hi, Uwe
>
> I'm sorry I didn't state the purpose of this patch clearly.
> As a matter of fact, it does change the behavior of set_pte_ext :)
Depends on what you call behaviour (and it's not the 'u' you dropped
that makes a difference :-). I meant that the side effects don't change.
It's only that they are accomplished in a different (probably more
effective) way.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

* [PATCH] Skip unnecessary pte makeup when clearing it.
From: bill4carson @ 2012-02-03 10:09 UTC
To: linux-arm-kernel

On 2012-02-03 17:35, Uwe Kleine-König wrote:
> On Fri, Feb 03, 2012 at 03:43:58PM +0800, bill4carson wrote:
>>
>>
>> On 2012-02-03 14:54, Uwe Kleine-König wrote:
>>> Hello,
>>>
>>> On Mon, Jan 30, 2012 at 04:36:07PM +0800, bill4carson at gmail.com wrote:
>>>> From: Bill Carson<bill4carson@gmail.com>
>>>>
>>>> If we are only about to clear a hardware pte entry, then pte makeup code is
>>>> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>>>>
>>>> Signed-off-by: Bill Carson<bill4carson@gmail.com>
>>> I haven't tested and I don't know if the patch brings any advantages like
>>> increased speed. But AFAICT it doesn't change the behaviour of
>>> armv6_set_pte_ext and cpu_v7_set_pte_ext.
>>>
>> Hi, Uwe
>>
>> I'm sorry I didn't state the purpose of this patch clearly.
>> As a matter of fact, it does change the behavior of set_pte_ext :)
> Depends on what you call behaviour (and it's not the 'u' you dropped
> that makes a difference :-). I meant that the side effects don't change.
> It's only that they are accomplished in a different (probably more
> effective) way.
>
Thanks for your explanation, I see what you mean now :)

Yes, from an outside point of view, set_pte_ext provides exactly the same
function as before; from an inside point of view, it runs faster than
before with this small modification. I see no reason not to do so.
Or am I missing something here?

> Best regards
> Uwe
>

-- 
I am a slow learner
but I will keep trying to fight for my dreams!

--bill

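The point both sides agree on here, that the value written to the hardware pte is the same before and after the patch, can be checked with a small model. This is a hedged sketch rather than the kernel's code: the bit positions are assumptions chosen for illustration and should be checked against the kernel's pgtable headers, CONFIG_CPU_USE_DOMAINS is left out, and makeup(), is_clear(), hw_pte_before(), hw_pte_after() and mismatches() are hypothetical names. One detail the model keeps on purpose: the old code tests YOUNG/PRESENT after "eor r1, r1, #L_PTE_DIRTY" has modified r1, whereas the new code tests before it. The two agree as long as L_PTE_DIRTY does not overlap those two bits.

```python
# Bit positions are assumptions for illustration only; verify them against
# the kernel's pgtable-2level headers before relying on them.
L_PTE_PRESENT = 1 << 0
L_PTE_YOUNG = 1 << 1
L_PTE_DIRTY = 1 << 6
L_PTE_RDONLY = 1 << 7
L_PTE_USER = 1 << 8
L_PTE_XN = 1 << 9

PTE_TYPE_MASK = 3 << 0
PTE_EXT_XN = 1 << 0
PTE_EXT_AP0 = 1 << 4
PTE_EXT_AP1 = 2 << 4
PTE_EXT_TEX1 = 1 << 6          # stands in for PTE_EXT_TEX(1)
PTE_EXT_APX = 1 << 9


def makeup(pte, ext):
    """Model of the 'pte makeup' block (CONFIG_CPU_USE_DOMAINS not modelled)."""
    hw = pte & ~0x3F0 & ~PTE_TYPE_MASK
    hw |= ext | PTE_EXT_AP0 | 2
    if pte & (1 << 4):
        hw |= PTE_EXT_TEX1
    if (pte ^ L_PTE_DIRTY) & (L_PTE_RDONLY | L_PTE_DIRTY):
        hw |= PTE_EXT_APX
    if pte & L_PTE_USER:
        hw |= PTE_EXT_AP1
    if pte & L_PTE_XN:
        hw |= PTE_EXT_XN
    return hw


def is_clear(r1):
    """tst/tstne/moveq: the hardware pte is zero unless YOUNG and PRESENT are both set."""
    return not (r1 & L_PTE_YOUNG and r1 & L_PTE_PRESENT)


def hw_pte_before(pte, ext):
    hw = makeup(pte, ext)
    r1 = pte ^ L_PTE_DIRTY     # old code tests r1 after the eor has run
    return 0 if is_clear(r1) else hw


def hw_pte_after(pte, ext):
    if is_clear(pte):          # new code tests first and branches to 1:
        return 0
    return makeup(pte, ext)


def mismatches(pte_bits=12, ext_values=(0x0, 0x0C, 0x40C)):
    """All (pte, ext) pairs from a small search space where the models differ."""
    return [(p, e) for p in range(1 << pte_bits) for e in ext_values
            if hw_pte_before(p, e) != hw_pte_after(p, e)]
```

Calling mismatches() walks the low 12 bits of the Linux pte for a few arbitrary ext values and returns the inputs on which the two models disagree; an empty list supports the "same side effects" reading within the limits of the model. It says nothing about speed, which nobody in the thread measured.
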
* [PATCH] Skip unnecessary pte makeup when clearing it.
From: Catalin Marinas @ 2012-02-03 11:27 UTC
To: linux-arm-kernel

On Mon, Jan 30, 2012 at 08:36:07AM +0000, bill4carson at gmail.com wrote:
> From: Bill Carson <bill4carson@gmail.com>
>
> If we are only about to clear a hardware pte entry, then pte makeup code is
> unnecessary for cpu_v7_set_pte_ext and armv6_set_pte_ext, so just skip it.
>
> Signed-off-by: Bill Carson <bill4carson@gmail.com>

I'll give a similar answer to Uwe's - the patch looks fine but I haven't
tested or run any benchmarks to tell whether it's worth it. Anyway,

Acked-by: Catalin Marinas <catalin.marinas@arm.com>