* [PATCH v3 0/4] Patches for -next
@ 2010-06-21 9:20 Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 1/4] ARM: Remove the domain switching on ARMv6k/v7 CPUs Catalin Marinas
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Catalin Marinas @ 2010-06-21 9:20 UTC (permalink / raw)
To: linux-arm-kernel
Just reposting these patches, no changes other than rebasing them on
2.6.35-rc3.
I would like to push them to the -next tree and aim to merge them in
2.6.37-rc1 (two merging windows from now).
Please let me know if there are any objections.
Thanks.
Catalin Marinas (4):
ARM: Remove the domain switching on ARMv6k/v7 CPUs
ARM: Use lazy cache flushing on ARMv7 SMP systems
ARM: Assume new page cache pages have dirty D-cache
ARM: Synchronise the I and D caches via set_pte_at() on SMP systems
arch/arm/include/asm/assembler.h | 13 +++---
arch/arm/include/asm/cacheflush.h | 6 +--
arch/arm/include/asm/domain.h | 31 +++++++++++++-
arch/arm/include/asm/futex.h | 9 ++--
arch/arm/include/asm/pgtable.h | 25 ++++++++++-
arch/arm/include/asm/smp_plat.h | 4 ++
arch/arm/include/asm/tlbflush.h | 12 ++++-
arch/arm/include/asm/traps.h | 2 +
arch/arm/include/asm/uaccess.h | 16 ++++---
arch/arm/kernel/entry-armv.S | 4 +-
arch/arm/kernel/fiq.c | 5 ++
arch/arm/kernel/traps.c | 14 ++++--
arch/arm/lib/getuser.S | 13 +++---
arch/arm/lib/putuser.S | 29 +++++++------
arch/arm/lib/uaccess.S | 83 +++++++++++++++++++------------------
arch/arm/mm/Kconfig | 8 ++++
arch/arm/mm/copypage-v4mc.c | 2 -
arch/arm/mm/copypage-v6.c | 2 -
arch/arm/mm/copypage-xscale.c | 2 -
arch/arm/mm/dma-mapping.c | 6 +++
arch/arm/mm/fault-armv.c | 8 ++--
arch/arm/mm/flush.c | 31 +++++++++-----
arch/arm/mm/mmu.c | 6 +--
arch/arm/mm/proc-macros.S | 7 +++
arch/arm/mm/proc-v7.S | 5 +-
25 files changed, 226 insertions(+), 117 deletions(-)
--
Catalin
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3 1/4] ARM: Remove the domain switching on ARMv6k/v7 CPUs
2010-06-21 9:20 [PATCH v3 0/4] Patches for -next Catalin Marinas
@ 2010-06-21 9:20 ` Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Catalin Marinas @ 2010-06-21 9:20 UTC (permalink / raw)
To: linux-arm-kernel
This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.
Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.
Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.
The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if HAS_TLS_REG is defined
since writing the TLS value to the high vectors page isn't possible.
The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/include/asm/assembler.h | 13 +++---
arch/arm/include/asm/domain.h | 31 +++++++++++++-
arch/arm/include/asm/futex.h | 9 ++--
arch/arm/include/asm/traps.h | 2 +
arch/arm/include/asm/uaccess.h | 16 ++++---
arch/arm/kernel/entry-armv.S | 4 +-
arch/arm/kernel/fiq.c | 5 ++
arch/arm/kernel/traps.c | 14 +++++-
arch/arm/lib/getuser.S | 13 +++---
arch/arm/lib/putuser.S | 29 +++++++------
arch/arm/lib/uaccess.S | 83 +++++++++++++++++++-------------------
arch/arm/mm/Kconfig | 8 ++++
arch/arm/mm/mmu.c | 6 +--
arch/arm/mm/proc-macros.S | 7 +++
arch/arm/mm/proc-v7.S | 5 +-
15 files changed, 153 insertions(+), 92 deletions(-)
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 6e8f05c..66db132 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -18,6 +18,7 @@
#endif
#include <asm/ptrace.h>
+#include <asm/domain.h>
/*
* Endian independent macros for shifting bytes within registers.
@@ -183,12 +184,12 @@
*/
#ifdef CONFIG_THUMB2_KERNEL
- .macro usraccoff, instr, reg, ptr, inc, off, cond, abort
+ .macro usraccoff, instr, reg, ptr, inc, off, cond, abort, t=T()
9999:
.if \inc == 1
- \instr\cond\()bt \reg, [\ptr, #\off]
+ \instr\cond\()b\()\t\().w \reg, [\ptr, #\off]
.elseif \inc == 4
- \instr\cond\()t \reg, [\ptr, #\off]
+ \instr\cond\()\t\().w \reg, [\ptr, #\off]
.else
.error "Unsupported inc macro argument"
.endif
@@ -223,13 +224,13 @@
#else /* !CONFIG_THUMB2_KERNEL */
- .macro usracc, instr, reg, ptr, inc, cond, rept, abort
+ .macro usracc, instr, reg, ptr, inc, cond, rept, abort, t=T()
.rept \rept
9999:
.if \inc == 1
- \instr\cond\()bt \reg, [\ptr], #\inc
+ \instr\cond\()b\()\t \reg, [\ptr], #\inc
.elseif \inc == 4
- \instr\cond\()t \reg, [\ptr], #\inc
+ \instr\cond\()\t \reg, [\ptr], #\inc
.else
.error "Unsupported inc macro argument"
.endif
diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index cc7ef40..af18cea 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -45,13 +45,17 @@
*/
#define DOMAIN_NOACCESS 0
#define DOMAIN_CLIENT 1
+#ifdef CONFIG_CPU_USE_DOMAINS
#define DOMAIN_MANAGER 3
+#else
+#define DOMAIN_MANAGER 1
+#endif
#define domain_val(dom,type) ((type) << (2*(dom)))
#ifndef __ASSEMBLY__
-#ifdef CONFIG_MMU
+#ifdef CONFIG_CPU_USE_DOMAINS
#define set_domain(x) \
do { \
__asm__ __volatile__( \
@@ -74,5 +78,28 @@
#define modify_domain(dom,type) do { } while (0)
#endif
+/*
+ * Generate the T (user) versions of the LDR/STR and related
+ * instructions (inline assembly)
+ */
+#ifdef CONFIG_CPU_USE_DOMAINS
+#define T(instr) #instr "t"
+#else
+#define T(instr) #instr
#endif
-#endif /* !__ASSEMBLY__ */
+
+#else /* __ASSEMBLY__ */
+
+/*
+ * Generate the T (user) versions of the LDR/STR and related
+ * instructions
+ */
+#ifdef CONFIG_CPU_USE_DOMAINS
+#define T(instr) instr ## t
+#else
+#define T(instr) instr
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* !__ASM_PROC_DOMAIN_H */
diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index 540a044..b33fe70 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -13,12 +13,13 @@
#include <linux/preempt.h>
#include <linux/uaccess.h>
#include <asm/errno.h>
+#include <asm/domain.h>
#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg) \
__asm__ __volatile__( \
- "1: ldrt %1, [%2]\n" \
+ "1: " T(ldr) " %1, [%2]\n" \
" " insn "\n" \
- "2: strt %0, [%2]\n" \
+ "2: " T(str) " %0, [%2]\n" \
" mov %0, #0\n" \
"3:\n" \
" .pushsection __ex_table,\"a\"\n" \
@@ -97,10 +98,10 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
pagefault_disable(); /* implies preempt_disable() */
__asm__ __volatile__("@futex_atomic_cmpxchg_inatomic\n"
- "1: ldrt %0, [%3]\n"
+ "1: " T(ldr) " %0, [%3]\n"
" teq %0, %1\n"
" it eq @ explicit IT needed for the 2b label\n"
- "2: streqt %2, [%3]\n"
+ "2: " T(streq) " %2, [%3]\n"
"3:\n"
" .pushsection __ex_table,\"a\"\n"
" .align 3\n"
diff --git a/arch/arm/include/asm/traps.h b/arch/arm/include/asm/traps.h
index 491960b..af5d5d1 100644
--- a/arch/arm/include/asm/traps.h
+++ b/arch/arm/include/asm/traps.h
@@ -27,4 +27,6 @@ static inline int in_exception_text(unsigned long ptr)
extern void __init early_trap_init(void);
extern void dump_backtrace_entry(unsigned long where, unsigned long from, unsigned long frame);
+extern void *vectors_page;
+
#endif
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 33e4a48..b293616 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -227,7 +227,7 @@ do { \
#define __get_user_asm_byte(x,addr,err) \
__asm__ __volatile__( \
- "1: ldrbt %1,[%2]\n" \
+ "1: " T(ldrb) " %1,[%2],#0\n" \
"2:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
@@ -263,7 +263,7 @@ do { \
#define __get_user_asm_word(x,addr,err) \
__asm__ __volatile__( \
- "1: ldrt %1,[%2]\n" \
+ "1: " T(ldr) " %1,[%2],#0\n" \
"2:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
@@ -308,7 +308,7 @@ do { \
#define __put_user_asm_byte(x,__pu_addr,err) \
__asm__ __volatile__( \
- "1: strbt %1,[%2]\n" \
+ "1: " T(strb) " %1,[%2],#0\n" \
"2:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
@@ -341,7 +341,7 @@ do { \
#define __put_user_asm_word(x,__pu_addr,err) \
__asm__ __volatile__( \
- "1: strt %1,[%2]\n" \
+ "1: " T(str) " %1,[%2],#0\n" \
"2:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
@@ -366,10 +366,10 @@ do { \
#define __put_user_asm_dword(x,__pu_addr,err) \
__asm__ __volatile__( \
- ARM( "1: strt " __reg_oper1 ", [%1], #4\n" ) \
- ARM( "2: strt " __reg_oper0 ", [%1]\n" ) \
- THUMB( "1: strt " __reg_oper1 ", [%1]\n" ) \
- THUMB( "2: strt " __reg_oper0 ", [%1, #4]\n" ) \
+ ARM( "1: " T(str) " " __reg_oper1 ", [%1], #4\n" ) \
+ ARM( "2: " T(str) " " __reg_oper0 ", [%1]\n" ) \
+ THUMB( "1: " T(str) " " __reg_oper1 ", [%1]\n" ) \
+ THUMB( "2: " T(str) " " __reg_oper0 ", [%1, #4]\n" ) \
"3:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 7ee48e7..ba654fa 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -736,7 +736,7 @@ ENTRY(__switch_to)
THUMB( stmia ip!, {r4 - sl, fp} ) @ Store most regs on stack
THUMB( str sp, [ip], #4 )
THUMB( str lr, [ip], #4 )
-#ifdef CONFIG_MMU
+#ifdef CONFIG_CPU_USE_DOMAINS
ldr r6, [r2, #TI_CPU_DOMAIN]
#endif
#if defined(CONFIG_HAS_TLS_REG)
@@ -745,7 +745,7 @@ ENTRY(__switch_to)
mov r4, #0xffff0fff
str r3, [r4, #-15] @ TLS val at 0xffff0ff0
#endif
-#ifdef CONFIG_MMU
+#ifdef CONFIG_CPU_USE_DOMAINS
mcr p15, 0, r6, c3, c0, 0 @ Set domain register
#endif
mov r5, r0
diff --git a/arch/arm/kernel/fiq.c b/arch/arm/kernel/fiq.c
index 6ff7919..d601ef2 100644
--- a/arch/arm/kernel/fiq.c
+++ b/arch/arm/kernel/fiq.c
@@ -45,6 +45,7 @@
#include <asm/fiq.h>
#include <asm/irq.h>
#include <asm/system.h>
+#include <asm/traps.h>
static unsigned long no_fiq_insn;
@@ -77,7 +78,11 @@ int show_fiq_list(struct seq_file *p, void *v)
void set_fiq_handler(void *start, unsigned int length)
{
+#if defined(CONFIG_CPU_USE_DOMAINS)
memcpy((void *)0xffff001c, start, length);
+#else
+ memcpy(vectors_page + 0x1c, start, length);
+#endif
flush_icache_range(0xffff001c, 0xffff001c + length);
if (!vectors_high())
flush_icache_range(0x1c, 0x1c + length);
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 1621e53..6bc57c5 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -36,6 +36,8 @@
static const char *handler[]= { "prefetch abort", "data abort", "address exception", "interrupt" };
+void *vectors_page;
+
#ifdef CONFIG_DEBUG_USER
unsigned int user_debug;
@@ -745,7 +747,11 @@ void __init trap_init(void)
void __init early_trap_init(void)
{
+#if defined(CONFIG_CPU_USE_DOMAINS)
unsigned long vectors = CONFIG_VECTORS_BASE;
+#else
+ unsigned long vectors = (unsigned long)vectors_page;
+#endif
extern char __stubs_start[], __stubs_end[];
extern char __vectors_start[], __vectors_end[];
extern char __kuser_helper_start[], __kuser_helper_end[];
@@ -764,10 +770,10 @@ void __init early_trap_init(void)
* Copy signal return handlers into the vector page, and
* set sigreturn to be a pointer to these.
*/
- memcpy((void *)KERN_SIGRETURN_CODE, sigreturn_codes,
- sizeof(sigreturn_codes));
- memcpy((void *)KERN_RESTART_CODE, syscall_restart_code,
- sizeof(syscall_restart_code));
+ memcpy((void *)(vectors + KERN_SIGRETURN_CODE - CONFIG_VECTORS_BASE),
+ sigreturn_codes, sizeof(sigreturn_codes));
+ memcpy((void *)(vectors + KERN_RESTART_CODE - CONFIG_VECTORS_BASE),
+ syscall_restart_code, sizeof(syscall_restart_code));
flush_icache_range(vectors, vectors + PAGE_SIZE);
modify_domain(DOMAIN_USER, DOMAIN_CLIENT);
diff --git a/arch/arm/lib/getuser.S b/arch/arm/lib/getuser.S
index b1631a7..1b049cd 100644
--- a/arch/arm/lib/getuser.S
+++ b/arch/arm/lib/getuser.S
@@ -28,20 +28,21 @@
*/
#include <linux/linkage.h>
#include <asm/errno.h>
+#include <asm/domain.h>
ENTRY(__get_user_1)
-1: ldrbt r2, [r0]
+1: T(ldrb) r2, [r0]
mov r0, #0
mov pc, lr
ENDPROC(__get_user_1)
ENTRY(__get_user_2)
#ifdef CONFIG_THUMB2_KERNEL
-2: ldrbt r2, [r0]
-3: ldrbt r3, [r0, #1]
+2: T(ldrb) r2, [r0]
+3: T(ldrb) r3, [r0, #1]
#else
-2: ldrbt r2, [r0], #1
-3: ldrbt r3, [r0]
+2: T(ldrb) r2, [r0], #1
+3: T(ldrb) r3, [r0]
#endif
#ifndef __ARMEB__
orr r2, r2, r3, lsl #8
@@ -53,7 +54,7 @@ ENTRY(__get_user_2)
ENDPROC(__get_user_2)
ENTRY(__get_user_4)
-4: ldrt r2, [r0]
+4: T(ldr) r2, [r0]
mov r0, #0
mov pc, lr
ENDPROC(__get_user_4)
diff --git a/arch/arm/lib/putuser.S b/arch/arm/lib/putuser.S
index 5a01a23..c023fc1 100644
--- a/arch/arm/lib/putuser.S
+++ b/arch/arm/lib/putuser.S
@@ -28,9 +28,10 @@
*/
#include <linux/linkage.h>
#include <asm/errno.h>
+#include <asm/domain.h>
ENTRY(__put_user_1)
-1: strbt r2, [r0]
+1: T(strb) r2, [r0]
mov r0, #0
mov pc, lr
ENDPROC(__put_user_1)
@@ -39,19 +40,19 @@ ENTRY(__put_user_2)
mov ip, r2, lsr #8
#ifdef CONFIG_THUMB2_KERNEL
#ifndef __ARMEB__
-2: strbt r2, [r0]
-3: strbt ip, [r0, #1]
+2: T(strb) r2, [r0]
+3: T(strb) ip, [r0, #1]
#else
-2: strbt ip, [r0]
-3: strbt r2, [r0, #1]
+2: T(strb) ip, [r0]
+3: T(strb) r2, [r0, #1]
#endif
#else /* !CONFIG_THUMB2_KERNEL */
#ifndef __ARMEB__
-2: strbt r2, [r0], #1
-3: strbt ip, [r0]
+2: T(strb) r2, [r0], #1
+3: T(strb) ip, [r0]
#else
-2: strbt ip, [r0], #1
-3: strbt r2, [r0]
+2: T(strb) ip, [r0], #1
+3: T(strb) r2, [r0]
#endif
#endif /* CONFIG_THUMB2_KERNEL */
mov r0, #0
@@ -59,18 +60,18 @@ ENTRY(__put_user_2)
ENDPROC(__put_user_2)
ENTRY(__put_user_4)
-4: strt r2, [r0]
+4: T(str) r2, [r0]
mov r0, #0
mov pc, lr
ENDPROC(__put_user_4)
ENTRY(__put_user_8)
#ifdef CONFIG_THUMB2_KERNEL
-5: strt r2, [r0]
-6: strt r3, [r0, #4]
+5: T(str) r2, [r0]
+6: T(str) r3, [r0, #4]
#else
-5: strt r2, [r0], #4
-6: strt r3, [r0]
+5: T(str) r2, [r0], #4
+6: T(str) r3, [r0]
#endif
mov r0, #0
mov pc, lr
diff --git a/arch/arm/lib/uaccess.S b/arch/arm/lib/uaccess.S
index fee9f6f..d0ece2a 100644
--- a/arch/arm/lib/uaccess.S
+++ b/arch/arm/lib/uaccess.S
@@ -14,6 +14,7 @@
#include <linux/linkage.h>
#include <asm/assembler.h>
#include <asm/errno.h>
+#include <asm/domain.h>
.text
@@ -31,11 +32,11 @@
rsb ip, ip, #4
cmp ip, #2
ldrb r3, [r1], #1
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
ldrgeb r3, [r1], #1
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
ldrgtb r3, [r1], #1
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
sub r2, r2, ip
b .Lc2u_dest_aligned
@@ -58,7 +59,7 @@ ENTRY(__copy_to_user)
addmi ip, r2, #4
bmi .Lc2u_0nowords
ldr r3, [r1], #4
-USER( strt r3, [r0], #4) @ May fault
+USER( T(str) r3, [r0], #4) @ May fault
mov ip, r0, lsl #32 - PAGE_SHIFT @ On each page, use a ld/st??t instruction
rsb ip, ip, #0
movs ip, ip, lsr #32 - PAGE_SHIFT
@@ -87,18 +88,18 @@ USER( strt r3, [r0], #4) @ May fault
stmneia r0!, {r3 - r4} @ Shouldnt fault
tst ip, #4
ldrne r3, [r1], #4
- strnet r3, [r0], #4 @ Shouldnt fault
+ T(strne) r3, [r0], #4 @ Shouldnt fault
ands ip, ip, #3
beq .Lc2u_0fupi
.Lc2u_0nowords: teq ip, #0
beq .Lc2u_finished
.Lc2u_nowords: cmp ip, #2
ldrb r3, [r1], #1
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
ldrgeb r3, [r1], #1
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
ldrgtb r3, [r1], #1
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
b .Lc2u_finished
.Lc2u_not_enough:
@@ -119,7 +120,7 @@ USER( strgtbt r3, [r0], #1) @ May fault
mov r3, r7, pull #8
ldr r7, [r1], #4
orr r3, r3, r7, push #24
-USER( strt r3, [r0], #4) @ May fault
+USER( T(str) r3, [r0], #4) @ May fault
mov ip, r0, lsl #32 - PAGE_SHIFT
rsb ip, ip, #0
movs ip, ip, lsr #32 - PAGE_SHIFT
@@ -154,18 +155,18 @@ USER( strt r3, [r0], #4) @ May fault
movne r3, r7, pull #8
ldrne r7, [r1], #4
orrne r3, r3, r7, push #24
- strnet r3, [r0], #4 @ Shouldnt fault
+ T(strne) r3, [r0], #4 @ Shouldnt fault
ands ip, ip, #3
beq .Lc2u_1fupi
.Lc2u_1nowords: mov r3, r7, get_byte_1
teq ip, #0
beq .Lc2u_finished
cmp ip, #2
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
movge r3, r7, get_byte_2
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
movgt r3, r7, get_byte_3
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
b .Lc2u_finished
.Lc2u_2fupi: subs r2, r2, #4
@@ -174,7 +175,7 @@ USER( strgtbt r3, [r0], #1) @ May fault
mov r3, r7, pull #16
ldr r7, [r1], #4
orr r3, r3, r7, push #16
-USER( strt r3, [r0], #4) @ May fault
+USER( T(str) r3, [r0], #4) @ May fault
mov ip, r0, lsl #32 - PAGE_SHIFT
rsb ip, ip, #0
movs ip, ip, lsr #32 - PAGE_SHIFT
@@ -209,18 +210,18 @@ USER( strt r3, [r0], #4) @ May fault
movne r3, r7, pull #16
ldrne r7, [r1], #4
orrne r3, r3, r7, push #16
- strnet r3, [r0], #4 @ Shouldnt fault
+ T(strne) r3, [r0], #4 @ Shouldnt fault
ands ip, ip, #3
beq .Lc2u_2fupi
.Lc2u_2nowords: mov r3, r7, get_byte_2
teq ip, #0
beq .Lc2u_finished
cmp ip, #2
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
movge r3, r7, get_byte_3
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
ldrgtb r3, [r1], #0
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
b .Lc2u_finished
.Lc2u_3fupi: subs r2, r2, #4
@@ -229,7 +230,7 @@ USER( strgtbt r3, [r0], #1) @ May fault
mov r3, r7, pull #24
ldr r7, [r1], #4
orr r3, r3, r7, push #8
-USER( strt r3, [r0], #4) @ May fault
+USER( T(str) r3, [r0], #4) @ May fault
mov ip, r0, lsl #32 - PAGE_SHIFT
rsb ip, ip, #0
movs ip, ip, lsr #32 - PAGE_SHIFT
@@ -264,18 +265,18 @@ USER( strt r3, [r0], #4) @ May fault
movne r3, r7, pull #24
ldrne r7, [r1], #4
orrne r3, r3, r7, push #8
- strnet r3, [r0], #4 @ Shouldnt fault
+ T(strne) r3, [r0], #4 @ Shouldnt fault
ands ip, ip, #3
beq .Lc2u_3fupi
.Lc2u_3nowords: mov r3, r7, get_byte_3
teq ip, #0
beq .Lc2u_finished
cmp ip, #2
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
ldrgeb r3, [r1], #1
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
ldrgtb r3, [r1], #0
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
b .Lc2u_finished
ENDPROC(__copy_to_user)
@@ -294,11 +295,11 @@ ENDPROC(__copy_to_user)
.Lcfu_dest_not_aligned:
rsb ip, ip, #4
cmp ip, #2
-USER( ldrbt r3, [r1], #1) @ May fault
+USER( T(ldrb) r3, [r1], #1) @ May fault
strb r3, [r0], #1
-USER( ldrgebt r3, [r1], #1) @ May fault
+USER( T(ldrgeb) r3, [r1], #1) @ May fault
strgeb r3, [r0], #1
-USER( ldrgtbt r3, [r1], #1) @ May fault
+USER( T(ldrgtb) r3, [r1], #1) @ May fault
strgtb r3, [r0], #1
sub r2, r2, ip
b .Lcfu_dest_aligned
@@ -321,7 +322,7 @@ ENTRY(__copy_from_user)
.Lcfu_0fupi: subs r2, r2, #4
addmi ip, r2, #4
bmi .Lcfu_0nowords
-USER( ldrt r3, [r1], #4)
+USER( T(ldr) r3, [r1], #4)
str r3, [r0], #4
mov ip, r1, lsl #32 - PAGE_SHIFT @ On each page, use a ld/st??t instruction
rsb ip, ip, #0
@@ -350,18 +351,18 @@ USER( ldrt r3, [r1], #4)
ldmneia r1!, {r3 - r4} @ Shouldnt fault
stmneia r0!, {r3 - r4}
tst ip, #4
- ldrnet r3, [r1], #4 @ Shouldnt fault
+ T(ldrne) r3, [r1], #4 @ Shouldnt fault
strne r3, [r0], #4
ands ip, ip, #3
beq .Lcfu_0fupi
.Lcfu_0nowords: teq ip, #0
beq .Lcfu_finished
.Lcfu_nowords: cmp ip, #2
-USER( ldrbt r3, [r1], #1) @ May fault
+USER( T(ldrb) r3, [r1], #1) @ May fault
strb r3, [r0], #1
-USER( ldrgebt r3, [r1], #1) @ May fault
+USER( T(ldrgeb) r3, [r1], #1) @ May fault
strgeb r3, [r0], #1
-USER( ldrgtbt r3, [r1], #1) @ May fault
+USER( T(ldrgtb) r3, [r1], #1) @ May fault
strgtb r3, [r0], #1
b .Lcfu_finished
@@ -374,7 +375,7 @@ USER( ldrgtbt r3, [r1], #1) @ May fault
.Lcfu_src_not_aligned:
bic r1, r1, #3
-USER( ldrt r7, [r1], #4) @ May fault
+USER( T(ldr) r7, [r1], #4) @ May fault
cmp ip, #2
bgt .Lcfu_3fupi
beq .Lcfu_2fupi
@@ -382,7 +383,7 @@ USER( ldrt r7, [r1], #4) @ May fault
addmi ip, r2, #4
bmi .Lcfu_1nowords
mov r3, r7, pull #8
-USER( ldrt r7, [r1], #4) @ May fault
+USER( T(ldr) r7, [r1], #4) @ May fault
orr r3, r3, r7, push #24
str r3, [r0], #4
mov ip, r1, lsl #32 - PAGE_SHIFT
@@ -417,7 +418,7 @@ USER( ldrt r7, [r1], #4) @ May fault
stmneia r0!, {r3 - r4}
tst ip, #4
movne r3, r7, pull #8
-USER( ldrnet r7, [r1], #4) @ May fault
+USER( T(ldrne) r7, [r1], #4) @ May fault
orrne r3, r3, r7, push #24
strne r3, [r0], #4
ands ip, ip, #3
@@ -437,7 +438,7 @@ USER( ldrnet r7, [r1], #4) @ May fault
addmi ip, r2, #4
bmi .Lcfu_2nowords
mov r3, r7, pull #16
-USER( ldrt r7, [r1], #4) @ May fault
+USER( T(ldr) r7, [r1], #4) @ May fault
orr r3, r3, r7, push #16
str r3, [r0], #4
mov ip, r1, lsl #32 - PAGE_SHIFT
@@ -473,7 +474,7 @@ USER( ldrt r7, [r1], #4) @ May fault
stmneia r0!, {r3 - r4}
tst ip, #4
movne r3, r7, pull #16
-USER( ldrnet r7, [r1], #4) @ May fault
+USER( T(ldrne) r7, [r1], #4) @ May fault
orrne r3, r3, r7, push #16
strne r3, [r0], #4
ands ip, ip, #3
@@ -485,7 +486,7 @@ USER( ldrnet r7, [r1], #4) @ May fault
strb r3, [r0], #1
movge r3, r7, get_byte_3
strgeb r3, [r0], #1
-USER( ldrgtbt r3, [r1], #0) @ May fault
+USER( T(ldrgtb) r3, [r1], #0) @ May fault
strgtb r3, [r0], #1
b .Lcfu_finished
@@ -493,7 +494,7 @@ USER( ldrgtbt r3, [r1], #0) @ May fault
addmi ip, r2, #4
bmi .Lcfu_3nowords
mov r3, r7, pull #24
-USER( ldrt r7, [r1], #4) @ May fault
+USER( T(ldr) r7, [r1], #4) @ May fault
orr r3, r3, r7, push #8
str r3, [r0], #4
mov ip, r1, lsl #32 - PAGE_SHIFT
@@ -528,7 +529,7 @@ USER( ldrt r7, [r1], #4) @ May fault
stmneia r0!, {r3 - r4}
tst ip, #4
movne r3, r7, pull #24
-USER( ldrnet r7, [r1], #4) @ May fault
+USER( T(ldrne) r7, [r1], #4) @ May fault
orrne r3, r3, r7, push #8
strne r3, [r0], #4
ands ip, ip, #3
@@ -538,9 +539,9 @@ USER( ldrnet r7, [r1], #4) @ May fault
beq .Lcfu_finished
cmp ip, #2
strb r3, [r0], #1
-USER( ldrgebt r3, [r1], #1) @ May fault
+USER( T(ldrgeb) r3, [r1], #1) @ May fault
strgeb r3, [r0], #1
-USER( ldrgtbt r3, [r1], #1) @ May fault
+USER( T(ldrgtb) r3, [r1], #1) @ May fault
strgtb r3, [r0], #1
b .Lcfu_finished
ENDPROC(__copy_from_user)
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 346ae14..f33c422 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -599,6 +599,14 @@ config CPU_CP15_MPU
help
Processor has the CP15 register, which has MPU related registers.
+config CPU_USE_DOMAINS
+ bool
+ depends on MMU
+ default y if !HAS_TLS_REG
+ help
+ This option enables or disables the use of domain switching
+ via the set_fs() function.
+
#
# CPU supports 36-bit I/O
#
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 2858941..499e22d 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -25,6 +25,7 @@
#include <asm/smp_plat.h>
#include <asm/tlb.h>
#include <asm/highmem.h>
+#include <asm/traps.h>
#include <asm/mach/arch.h>
#include <asm/mach/map.h>
@@ -935,12 +936,11 @@ static void __init devicemaps_init(struct machine_desc *mdesc)
{
struct map_desc map;
unsigned long addr;
- void *vectors;
/*
* Allocate the vector page early.
*/
- vectors = alloc_bootmem_low_pages(PAGE_SIZE);
+ vectors_page = alloc_bootmem_low_pages(PAGE_SIZE);
for (addr = VMALLOC_END; addr; addr += PGDIR_SIZE)
pmd_clear(pmd_off_k(addr));
@@ -980,7 +980,7 @@ static void __init devicemaps_init(struct machine_desc *mdesc)
* location (0xffff0000). If we aren't using high-vectors, also
* create a mapping at the low-vectors virtual address.
*/
- map.pfn = __phys_to_pfn(virt_to_phys(vectors));
+ map.pfn = __phys_to_pfn(virt_to_phys(vectors_page));
map.virtual = 0xffff0000;
map.length = PAGE_SIZE;
map.type = MT_HIGH_VECTORS;
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 7d63bea..337f102 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -99,6 +99,10 @@
* 110x 0 1 0 r/w r/o
* 11x0 0 1 0 r/w r/o
* 1111 0 1 1 r/w r/w
+ *
+ * If !CONFIG_CPU_USE_DOMAINS, the following permissions are changed:
+ * 110x 1 1 1 r/o r/o
+ * 11x0 1 1 1 r/o r/o
*/
.macro armv6_mt_table pfx
\pfx\()_mt_table:
@@ -138,8 +142,11 @@
tst r1, #L_PTE_USER
orrne r3, r3, #PTE_EXT_AP1
+#ifdef CONFIG_CPU_USE_DOMAINS
+ @ allow kernel read/write access to read-only user pages
tstne r3, #PTE_EXT_APX
bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
+#endif
tst r1, #L_PTE_EXEC
orreq r3, r3, #PTE_EXT_XN
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 7aaf88a..c1c3fe0 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -152,8 +152,11 @@ ENTRY(cpu_v7_set_pte_ext)
tst r1, #L_PTE_USER
orrne r3, r3, #PTE_EXT_AP1
+#ifdef CONFIG_CPU_USE_DOMAINS
+ @ allow kernel read/write access to read-only user pages
tstne r3, #PTE_EXT_APX
bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
+#endif
tst r1, #L_PTE_EXEC
orreq r3, r3, #PTE_EXT_XN
@@ -240,8 +243,6 @@ __v7_setup:
mcr p15, 0, r10, c2, c0, 2 @ TTB control register
orr r4, r4, #TTB_FLAGS
mcr p15, 0, r4, c2, c0, 1 @ load TTB1
- mov r10, #0x1f @ domains 0, 1 = manager
- mcr p15, 0, r10, c3, c0, 0 @ load domain access register
/*
* Memory region attributes with SCTLR.TRE=1
*
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMP systems
2010-06-21 9:20 [PATCH v3 0/4] Patches for -next Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 1/4] ARM: Remove the domain switching on ARMv6k/v7 CPUs Catalin Marinas
@ 2010-06-21 9:20 ` Catalin Marinas
2010-06-21 9:38 ` Russell King - ARM Linux
2010-06-21 9:20 ` [PATCH v3 3/4] ARM: Assume new page cache pages have dirty D-cache Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 4/4] ARM: Synchronise the I and D caches via set_pte_at() on SMP systems Catalin Marinas
3 siblings, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2010-06-21 9:20 UTC (permalink / raw)
To: linux-arm-kernel
ARMv7 processors like Cortex-A9 broadcast the cache maintenance
operations in hardware. This patch allows the
flush_dcache_page/update_mmu_cache pair to work in lazy flushing mode
similar to the UP case.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/include/asm/smp_plat.h | 4 ++++
arch/arm/mm/fault-armv.c | 2 --
arch/arm/mm/flush.c | 13 ++++---------
3 files changed, 8 insertions(+), 11 deletions(-)
diff --git a/arch/arm/include/asm/smp_plat.h b/arch/arm/include/asm/smp_plat.h
index e621530..963a338 100644
--- a/arch/arm/include/asm/smp_plat.h
+++ b/arch/arm/include/asm/smp_plat.h
@@ -13,9 +13,13 @@ static inline int tlb_ops_need_broadcast(void)
return ((read_cpuid_ext(CPUID_EXT_MMFR3) >> 12) & 0xf) < 2;
}
+#if !defined(CONFIG_SMP) || __LINUX_ARM_ARCH__ >= 7
+#define cache_ops_need_broadcast() 0
+#else
static inline int cache_ops_need_broadcast(void)
{
return ((read_cpuid_ext(CPUID_EXT_MMFR3) >> 12) & 0xf) < 1;
}
+#endif
#endif
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 9b906de..9d77d90 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -168,10 +168,8 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
return;
mapping = page_mapping(page);
-#ifndef CONFIG_SMP
if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
__flush_dcache_page(mapping, page);
-#endif
if (mapping) {
if (cache_is_vivt())
make_coherent(mapping, vma, addr, ptep, pfn);
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index c6844cb..5ad8711 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -17,6 +17,7 @@
#include <asm/smp_plat.h>
#include <asm/system.h>
#include <asm/tlbflush.h>
+#include <asm/smp_plat.h>
#include "mm.h"
@@ -93,12 +94,10 @@ void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr, unsig
#define flush_pfn_alias(pfn,vaddr) do { } while (0)
#endif
-#ifdef CONFIG_SMP
static void flush_ptrace_access_other(void *args)
{
__flush_icache_all();
}
-#endif
static
void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
@@ -122,11 +121,9 @@ void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
if (vma->vm_flags & VM_EXEC) {
unsigned long addr = (unsigned long)kaddr;
__cpuc_coherent_kern_range(addr, addr + len);
-#ifdef CONFIG_SMP
if (cache_ops_need_broadcast())
smp_call_function(flush_ptrace_access_other,
NULL, 1);
-#endif
}
}
@@ -246,12 +243,10 @@ void flush_dcache_page(struct page *page)
mapping = page_mapping(page);
-#ifndef CONFIG_SMP
- if (!PageHighMem(page) && mapping && !mapping_mapped(mapping))
+ if (!cache_ops_need_broadcast() &&
+ !PageHighMem(page) && mapping && !mapping_mapped(mapping))
set_bit(PG_dcache_dirty, &page->flags);
- else
-#endif
- {
+ else {
__flush_dcache_page(mapping, page);
if (mapping && cache_is_vivt())
__flush_dcache_aliases(mapping, page);
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v3 3/4] ARM: Assume new page cache pages have dirty D-cache
2010-06-21 9:20 [PATCH v3 0/4] Patches for -next Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 1/4] ARM: Remove the domain switching on ARMv6k/v7 CPUs Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
@ 2010-06-21 9:20 ` Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 4/4] ARM: Synchronise the I and D caches via set_pte_at() on SMP systems Catalin Marinas
3 siblings, 0 replies; 7+ messages in thread
From: Catalin Marinas @ 2010-06-21 9:20 UTC (permalink / raw)
To: linux-arm-kernel
There are places in Linux where writes to newly allocated page cache
pages happen without a subsequent call to flush_dcache_page() (several
PIO drivers including USB HCD). This patch changes the meaning of
PG_arch_1 to be PG_dcache_clean and always flush the D-cache for a newly
mapped page in update_mmu_cache().
The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/include/asm/cacheflush.h | 6 +++---
arch/arm/include/asm/tlbflush.h | 2 +-
arch/arm/mm/copypage-v4mc.c | 2 +-
arch/arm/mm/copypage-v6.c | 2 +-
arch/arm/mm/copypage-xscale.c | 2 +-
arch/arm/mm/dma-mapping.c | 6 ++++++
arch/arm/mm/fault-armv.c | 4 ++--
arch/arm/mm/flush.c | 3 ++-
8 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 4656a24..d3730f0 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -137,10 +137,10 @@
#endif
/*
- * This flag is used to indicate that the page pointed to by a pte
- * is dirty and requires cleaning before returning it to the user.
+ * This flag is used to indicate that the page pointed to by a pte is clean
+ * and does not require cleaning before returning it to the user.
*/
-#define PG_dcache_dirty PG_arch_1
+#define PG_dcache_clean PG_arch_1
/*
* MM Cache Management
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index bd863d8..40a7092 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -552,7 +552,7 @@ extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
#endif
/*
- * if PG_dcache_dirty is set for the page, we need to ensure that any
+ * If PG_dcache_clean is not set for the page, we need to ensure that any
* cache entries for the kernels virtual memory range are written
* back to the page.
*/
diff --git a/arch/arm/mm/copypage-v4mc.c b/arch/arm/mm/copypage-v4mc.c
index 598c51a..b806151 100644
--- a/arch/arm/mm/copypage-v4mc.c
+++ b/arch/arm/mm/copypage-v4mc.c
@@ -73,7 +73,7 @@ void v4_mc_copy_user_highpage(struct page *to, struct page *from,
{
void *kto = kmap_atomic(to, KM_USER1);
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
spin_lock(&minicache_lock);
diff --git a/arch/arm/mm/copypage-v6.c b/arch/arm/mm/copypage-v6.c
index f55fa10..bdba6c6 100644
--- a/arch/arm/mm/copypage-v6.c
+++ b/arch/arm/mm/copypage-v6.c
@@ -79,7 +79,7 @@ static void v6_copy_user_highpage_aliasing(struct page *to,
unsigned int offset = CACHE_COLOUR(vaddr);
unsigned long kfrom, kto;
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
/* FIXME: not highmem safe */
diff --git a/arch/arm/mm/copypage-xscale.c b/arch/arm/mm/copypage-xscale.c
index 9920c0a..649bbcd 100644
--- a/arch/arm/mm/copypage-xscale.c
+++ b/arch/arm/mm/copypage-xscale.c
@@ -95,7 +95,7 @@ void xscale_mc_copy_user_highpage(struct page *to, struct page *from,
{
void *kto = kmap_atomic(to, KM_USER1);
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
spin_lock(&minicache_lock);
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 13fa536..bb9b612 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -508,6 +508,12 @@ void ___dma_page_dev_to_cpu(struct page *page, unsigned long off,
outer_inv_range(paddr, paddr + size);
dma_cache_maint_page(page, off, size, dir, dmac_unmap_area);
+
+ /*
+ * Mark the D-cache clean for this page to avoid extra flushing.
+ */
+ if (dir != DMA_TO_DEVICE)
+ set_bit(PG_dcache_clean, &page->flags);
}
EXPORT_SYMBOL(___dma_page_dev_to_cpu);
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 9d77d90..91a691f 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -141,7 +141,7 @@ make_coherent(struct address_space *mapping, struct vm_area_struct *vma,
* a page table, or changing an existing PTE. Basically, there are two
* things that we need to take care of:
*
- * 1. If PG_dcache_dirty is set for the page, we need to ensure
+ * 1. If PG_dcache_clean is not set for the page, we need to ensure
* that any cache entries for the kernels virtual memory
* range are written back to the page.
* 2. If we have multiple shared mappings of the same space in
@@ -168,7 +168,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
return;
mapping = page_mapping(page);
- if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &page->flags))
__flush_dcache_page(mapping, page);
if (mapping) {
if (cache_is_vivt())
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 5ad8711..6f55e36 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -245,13 +245,14 @@ void flush_dcache_page(struct page *page)
if (!cache_ops_need_broadcast() &&
!PageHighMem(page) && mapping && !mapping_mapped(mapping))
- set_bit(PG_dcache_dirty, &page->flags);
+ clear_bit(PG_dcache_clean, &page->flags);
else {
__flush_dcache_page(mapping, page);
if (mapping && cache_is_vivt())
__flush_dcache_aliases(mapping, page);
else if (mapping)
__flush_icache_all();
+ set_bit(PG_dcache_clean, &page->flags);
}
}
EXPORT_SYMBOL(flush_dcache_page);
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v3 4/4] ARM: Synchronise the I and D caches via set_pte_at() on SMP systems
2010-06-21 9:20 [PATCH v3 0/4] Patches for -next Catalin Marinas
` (2 preceding siblings ...)
2010-06-21 9:20 ` [PATCH v3 3/4] ARM: Assume new page cache pages have dirty D-cache Catalin Marinas
@ 2010-06-21 9:20 ` Catalin Marinas
3 siblings, 0 replies; 7+ messages in thread
From: Catalin Marinas @ 2010-06-21 9:20 UTC (permalink / raw)
To: linux-arm-kernel
On SMP systems, there is a small chance of a PTE becoming visible to a
different CPU before the cache maintenance operations in
update_mmu_cache(). This patch follows the IA-64 and PowerPC approach of
synchronising the I and D caches via the set_pte_at() function. In this
case, there is no need for update_mmu_cache() to be implemented since
lazy cache flushing is already handled by the time this function is
called.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/include/asm/pgtable.h | 25 ++++++++++++++++++++++---
arch/arm/include/asm/tlbflush.h | 10 +++++++++-
arch/arm/mm/fault-armv.c | 2 ++
arch/arm/mm/flush.c | 15 +++++++++++++++
4 files changed, 48 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index ab68cf1..a3752f5 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -278,9 +278,24 @@ extern struct page *empty_zero_page;
#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
-#define set_pte_at(mm,addr,ptep,pteval) do { \
- set_pte_ext(ptep, pteval, (addr) >= TASK_SIZE ? 0 : PTE_EXT_NG); \
- } while (0)
+#ifndef CONFIG_SMP
+static inline void __sync_icache_dcache(pte_t pteval)
+{
+}
+#else
+extern void __sync_icache_dcache(pte_t pteval);
+#endif
+
+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pteval)
+{
+ if (addr >= TASK_SIZE)
+ set_pte_ext(ptep, pteval, 0);
+ else {
+ __sync_icache_dcache(pteval);
+ set_pte_ext(ptep, pteval, PTE_EXT_NG);
+ }
+}
/*
* The following only work if pte_present() is true.
@@ -292,6 +307,10 @@ extern struct page *empty_zero_page;
#define pte_young(pte) (pte_val(pte) & L_PTE_YOUNG)
#define pte_special(pte) (0)
+#define pte_present_exec_user(pte) \
+ ((pte_val(pte) & (L_PTE_PRESENT | L_PTE_EXEC | L_PTE_USER)) == \
+ (L_PTE_PRESENT | L_PTE_EXEC | L_PTE_USER))
+
#define PTE_BIT_FUNC(fn,op) \
static inline pte_t pte_##fn(pte_t pte) { pte_val(pte) op; return pte; }
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index 40a7092..6aabedd 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -554,10 +554,18 @@ extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
/*
* If PG_dcache_clean is not set for the page, we need to ensure that any
* cache entries for the kernels virtual memory range are written
- * back to the page.
+ * back to the page. On SMP systems, the cache coherency is handled in the
+ * set_pte_at() function.
*/
+#ifndef CONFIG_SMP
extern void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
pte_t *ptep);
+#else
+static inline void update_mmu_cache(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+}
+#endif
#endif
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 91a691f..030f1da 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -28,6 +28,7 @@
static unsigned long shared_pte_mask = L_PTE_MT_BUFFERABLE;
+#ifndef CONFIG_SMP
/*
* We take the easy way out of this problem - we make the
* PTE uncacheable. However, we leave the write buffer on.
@@ -177,6 +178,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
__flush_icache_all();
}
}
+#endif /* !CONFIG_SMP */
/*
* Check whether the write buffer has physical address aliasing
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 6f55e36..2d08a5e 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -212,6 +212,21 @@ static void __flush_dcache_aliases(struct address_space *mapping, struct page *p
flush_dcache_mmap_unlock(mapping);
}
+#ifdef CONFIG_SMP
+void __sync_icache_dcache(pte_t pteval)
+{
+ unsigned long pfn = pte_pfn(pteval);
+
+ if (pfn_valid(pfn) && pte_present_exec_user(pteval)) {
+ struct page *page = pfn_to_page(pfn);
+
+ if (!test_and_set_bit(PG_dcache_clean, &page->flags))
+ __flush_dcache_page(NULL, page);
+ __flush_icache_all();
+ }
+}
+#endif
+
/*
* Ensure cache coherency between kernel mapping and userspace mapping
* of this page.
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMP systems
2010-06-21 9:20 ` [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
@ 2010-06-21 9:38 ` Russell King - ARM Linux
2010-06-21 9:42 ` [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMPsystems Catalin Marinas
0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2010-06-21 9:38 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 21, 2010 at 10:20:29AM +0100, Catalin Marinas wrote:
> ARMv7 processors like Cortex-A9 broadcast the cache maintenance
> operations in hardware. This patch allows the
> flush_dcache_page/update_mmu_cache pair to work in lazy flushing mode
> similar to the UP case.
No. We know that this trick can't be used on SMP, because update_mmu_cache
is called after the PTE has been established and the page has become
visible to other CPUs in the system.
So this optimization must remain disabled on SMP for correctness.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMPsystems
2010-06-21 9:38 ` Russell King - ARM Linux
@ 2010-06-21 9:42 ` Catalin Marinas
0 siblings, 0 replies; 7+ messages in thread
From: Catalin Marinas @ 2010-06-21 9:42 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 2010-06-21 at 10:38 +0100, Russell King - ARM Linux wrote:
> On Mon, Jun 21, 2010 at 10:20:29AM +0100, Catalin Marinas wrote:
> > ARMv7 processors like Cortex-A9 broadcast the cache maintenance
> > operations in hardware. This patch allows the
> > flush_dcache_page/update_mmu_cache pair to work in lazy flushing mode
> > similar to the UP case.
>
> No. We know that this trick can't be used on SMP, because update_mmu_cache
> is called after the PTE has been established and the page has become
> visible to other CPUs in the system.
>
> So this optimization must remain disabled on SMP for correctness.
Even the current code isn't correct because the I-cache is invalidated
via update_mmu_cache() but set_pte_at() gets called before.
So this is handled by patch 4/4 in the series. I can reorder them but I
couldn't see any reason since the current non-lazy behaviour isn't
correct either.
--
Catalin
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-06-21 9:42 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-21 9:20 [PATCH v3 0/4] Patches for -next Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 1/4] ARM: Remove the domain switching on ARMv6k/v7 CPUs Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
2010-06-21 9:38 ` Russell King - ARM Linux
2010-06-21 9:42 ` [PATCH v3 2/4] ARM: Use lazy cache flushing on ARMv7 SMPsystems Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 3/4] ARM: Assume new page cache pages have dirty D-cache Catalin Marinas
2010-06-21 9:20 ` [PATCH v3 4/4] ARM: Synchronise the I and D caches via set_pte_at() on SMP systems Catalin Marinas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).