* [PATCH 0/9] CM's patches for the next merging window(s)
@ 2010-07-19 13:43 Catalin Marinas
2010-07-19 13:44 ` [PATCH 1/9] ARM: Remove the domain switching on ARMv6k/v7 CPUs Catalin Marinas
` (9 more replies)
0 siblings, 10 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:43 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
This is a series of patches I maintain in my tree at (the 'rebased'
branch for individual patches or 'master' for a merge-friendly branch):
http://git.kernel.org/?p=linux/kernel/git/cmarinas/linux-2.6-cm.git
Once acked, I'll send them to Russell's patch system.
As usual, any comments are welcome. Thanks.
Catalin Marinas (9):
ARM: Remove the domain switching on ARMv6k/v7 CPUs
ARM: Assume new page cache pages have dirty D-cache
ARM: Introduce __sync_icache_dcache() for VIPT caches
ARM: Use lazy cache flushing on ARMv7 SMP systems
ARM: Introduce *_relaxed() I/O accessors
ARM: Convert L2x0 to use the IO relaxed operations
ARM: Add barriers to the I/O accessors if ARM_DMA_MEM_BUFFERABLE
ARM: Improve the L2 cache performance when PL310 is used
ARM: Implement phys_mem_access_prot() to avoid attributes aliasing
arch/arm/include/asm/assembler.h | 13 +++--
arch/arm/include/asm/cacheflush.h | 6 +-
arch/arm/include/asm/domain.h | 31 ++++++++++++
arch/arm/include/asm/elf.h | 2 +
arch/arm/include/asm/futex.h | 9 ++--
arch/arm/include/asm/io.h | 40 +++++++++++-----
arch/arm/include/asm/pgtable.h | 29 ++++++++++--
arch/arm/include/asm/smp_plat.h | 4 ++
arch/arm/include/asm/tlbflush.h | 12 ++++-
arch/arm/include/asm/traps.h | 2 +
arch/arm/include/asm/uaccess.h | 16 +++---
arch/arm/kernel/entry-armv.S | 4 +-
arch/arm/kernel/fiq.c | 5 ++
arch/arm/kernel/module.c | 34 ++++++++++++++
arch/arm/kernel/traps.c | 14 ++++--
arch/arm/lib/getuser.S | 13 +++--
arch/arm/lib/putuser.S | 29 ++++++------
arch/arm/lib/uaccess.S | 83 +++++++++++++++++----------------
arch/arm/mm/Kconfig | 16 ++++++
arch/arm/mm/cache-l2x0.c | 41 ++++++++++------
arch/arm/mm/copypage-v4mc.c | 2 -
arch/arm/mm/copypage-v6.c | 2 -
arch/arm/mm/copypage-xscale.c | 2 -
arch/arm/mm/dma-mapping.c | 6 ++
arch/arm/mm/fault-armv.c | 8 ++-
arch/arm/mm/flush.c | 46 ++++++++++++++-----
arch/arm/mm/mmu.c | 20 +++++++-
arch/arm/mm/proc-macros.S | 7 +++
arch/arm/mm/proc-v7.S | 5 +-
drivers/net/smsc911x.c | 92 +++++++++++++++++++++----------------
30 files changed, 408 insertions(+), 185 deletions(-)
--
Catalin
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/9] ARM: Remove the domain switching on ARMv6k/v7 CPUs
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-19 13:44 ` [PATCH 2/9] ARM: Assume new page cache pages have dirty D-cache Catalin Marinas
` (8 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.
Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.
Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.
The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if HAS_TLS_REG is defined
since writing the TLS value to the high vectors page isn't possible.
The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Anton Vorontsov <cbouatmailru@gmail.com>
---
arch/arm/include/asm/assembler.h | 13 +++---
arch/arm/include/asm/domain.h | 31 +++++++++++++-
arch/arm/include/asm/futex.h | 9 ++--
arch/arm/include/asm/traps.h | 2 +
arch/arm/include/asm/uaccess.h | 16 ++++---
arch/arm/kernel/entry-armv.S | 4 +-
arch/arm/kernel/fiq.c | 5 ++
arch/arm/kernel/traps.c | 14 +++++-
arch/arm/lib/getuser.S | 13 +++---
arch/arm/lib/putuser.S | 29 +++++++------
arch/arm/lib/uaccess.S | 83 +++++++++++++++++++-------------------
arch/arm/mm/Kconfig | 8 ++++
arch/arm/mm/mmu.c | 6 +--
arch/arm/mm/proc-macros.S | 7 +++
arch/arm/mm/proc-v7.S | 5 +-
15 files changed, 153 insertions(+), 92 deletions(-)
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 6e8f05c..66db132 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -18,6 +18,7 @@
#endif
#include <asm/ptrace.h>
+#include <asm/domain.h>
/*
* Endian independent macros for shifting bytes within registers.
@@ -183,12 +184,12 @@
*/
#ifdef CONFIG_THUMB2_KERNEL
- .macro usraccoff, instr, reg, ptr, inc, off, cond, abort
+ .macro usraccoff, instr, reg, ptr, inc, off, cond, abort, t=T()
9999:
.if \inc == 1
- \instr\cond\()bt \reg, [\ptr, #\off]
+ \instr\cond\()b\()\t\().w \reg, [\ptr, #\off]
.elseif \inc == 4
- \instr\cond\()t \reg, [\ptr, #\off]
+ \instr\cond\()\t\().w \reg, [\ptr, #\off]
.else
.error "Unsupported inc macro argument"
.endif
@@ -223,13 +224,13 @@
#else /* !CONFIG_THUMB2_KERNEL */
- .macro usracc, instr, reg, ptr, inc, cond, rept, abort
+ .macro usracc, instr, reg, ptr, inc, cond, rept, abort, t=T()
.rept \rept
9999:
.if \inc == 1
- \instr\cond\()bt \reg, [\ptr], #\inc
+ \instr\cond\()b\()\t \reg, [\ptr], #\inc
.elseif \inc == 4
- \instr\cond\()t \reg, [\ptr], #\inc
+ \instr\cond\()\t \reg, [\ptr], #\inc
.else
.error "Unsupported inc macro argument"
.endif
diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index cc7ef40..af18cea 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -45,13 +45,17 @@
*/
#define DOMAIN_NOACCESS 0
#define DOMAIN_CLIENT 1
+#ifdef CONFIG_CPU_USE_DOMAINS
#define DOMAIN_MANAGER 3
+#else
+#define DOMAIN_MANAGER 1
+#endif
#define domain_val(dom,type) ((type) << (2*(dom)))
#ifndef __ASSEMBLY__
-#ifdef CONFIG_MMU
+#ifdef CONFIG_CPU_USE_DOMAINS
#define set_domain(x) \
do { \
__asm__ __volatile__( \
@@ -74,5 +78,28 @@
#define modify_domain(dom,type) do { } while (0)
#endif
+/*
+ * Generate the T (user) versions of the LDR/STR and related
+ * instructions (inline assembly)
+ */
+#ifdef CONFIG_CPU_USE_DOMAINS
+#define T(instr) #instr "t"
+#else
+#define T(instr) #instr
#endif
-#endif /* !__ASSEMBLY__ */
+
+#else /* __ASSEMBLY__ */
+
+/*
+ * Generate the T (user) versions of the LDR/STR and related
+ * instructions
+ */
+#ifdef CONFIG_CPU_USE_DOMAINS
+#define T(instr) instr ## t
+#else
+#define T(instr) instr
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* !__ASM_PROC_DOMAIN_H */
diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index 540a044..b33fe70 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -13,12 +13,13 @@
#include <linux/preempt.h>
#include <linux/uaccess.h>
#include <asm/errno.h>
+#include <asm/domain.h>
#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg) \
__asm__ __volatile__( \
- "1: ldrt %1, [%2]\n" \
+ "1: " T(ldr) " %1, [%2]\n" \
" " insn "\n" \
- "2: strt %0, [%2]\n" \
+ "2: " T(str) " %0, [%2]\n" \
" mov %0, #0\n" \
"3:\n" \
" .pushsection __ex_table,\"a\"\n" \
@@ -97,10 +98,10 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
pagefault_disable(); /* implies preempt_disable() */
__asm__ __volatile__("@futex_atomic_cmpxchg_inatomic\n"
- "1: ldrt %0, [%3]\n"
+ "1: " T(ldr) " %0, [%3]\n"
" teq %0, %1\n"
" it eq @ explicit IT needed for the 2b label\n"
- "2: streqt %2, [%3]\n"
+ "2: " T(streq) " %2, [%3]\n"
"3:\n"
" .pushsection __ex_table,\"a\"\n"
" .align 3\n"
diff --git a/arch/arm/include/asm/traps.h b/arch/arm/include/asm/traps.h
index 491960b..af5d5d1 100644
--- a/arch/arm/include/asm/traps.h
+++ b/arch/arm/include/asm/traps.h
@@ -27,4 +27,6 @@ static inline int in_exception_text(unsigned long ptr)
extern void __init early_trap_init(void);
extern void dump_backtrace_entry(unsigned long where, unsigned long from, unsigned long frame);
+extern void *vectors_page;
+
#endif
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 33e4a48..b293616 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -227,7 +227,7 @@ do { \
#define __get_user_asm_byte(x,addr,err) \
__asm__ __volatile__( \
- "1: ldrbt %1,[%2]\n" \
+ "1: " T(ldrb) " %1,[%2],#0\n" \
"2:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
@@ -263,7 +263,7 @@ do { \
#define __get_user_asm_word(x,addr,err) \
__asm__ __volatile__( \
- "1: ldrt %1,[%2]\n" \
+ "1: " T(ldr) " %1,[%2],#0\n" \
"2:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
@@ -308,7 +308,7 @@ do { \
#define __put_user_asm_byte(x,__pu_addr,err) \
__asm__ __volatile__( \
- "1: strbt %1,[%2]\n" \
+ "1: " T(strb) " %1,[%2],#0\n" \
"2:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
@@ -341,7 +341,7 @@ do { \
#define __put_user_asm_word(x,__pu_addr,err) \
__asm__ __volatile__( \
- "1: strt %1,[%2]\n" \
+ "1: " T(str) " %1,[%2],#0\n" \
"2:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
@@ -366,10 +366,10 @@ do { \
#define __put_user_asm_dword(x,__pu_addr,err) \
__asm__ __volatile__( \
- ARM( "1: strt " __reg_oper1 ", [%1], #4\n" ) \
- ARM( "2: strt " __reg_oper0 ", [%1]\n" ) \
- THUMB( "1: strt " __reg_oper1 ", [%1]\n" ) \
- THUMB( "2: strt " __reg_oper0 ", [%1, #4]\n" ) \
+ ARM( "1: " T(str) " " __reg_oper1 ", [%1], #4\n" ) \
+ ARM( "2: " T(str) " " __reg_oper0 ", [%1]\n" ) \
+ THUMB( "1: " T(str) " " __reg_oper1 ", [%1]\n" ) \
+ THUMB( "2: " T(str) " " __reg_oper0 ", [%1, #4]\n" ) \
"3:\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 7ee48e7..ba654fa 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -736,7 +736,7 @@ ENTRY(__switch_to)
THUMB( stmia ip!, {r4 - sl, fp} ) @ Store most regs on stack
THUMB( str sp, [ip], #4 )
THUMB( str lr, [ip], #4 )
-#ifdef CONFIG_MMU
+#ifdef CONFIG_CPU_USE_DOMAINS
ldr r6, [r2, #TI_CPU_DOMAIN]
#endif
#if defined(CONFIG_HAS_TLS_REG)
@@ -745,7 +745,7 @@ ENTRY(__switch_to)
mov r4, #0xffff0fff
str r3, [r4, #-15] @ TLS val at 0xffff0ff0
#endif
-#ifdef CONFIG_MMU
+#ifdef CONFIG_CPU_USE_DOMAINS
mcr p15, 0, r6, c3, c0, 0 @ Set domain register
#endif
mov r5, r0
diff --git a/arch/arm/kernel/fiq.c b/arch/arm/kernel/fiq.c
index 6ff7919..d601ef2 100644
--- a/arch/arm/kernel/fiq.c
+++ b/arch/arm/kernel/fiq.c
@@ -45,6 +45,7 @@
#include <asm/fiq.h>
#include <asm/irq.h>
#include <asm/system.h>
+#include <asm/traps.h>
static unsigned long no_fiq_insn;
@@ -77,7 +78,11 @@ int show_fiq_list(struct seq_file *p, void *v)
void set_fiq_handler(void *start, unsigned int length)
{
+#if defined(CONFIG_CPU_USE_DOMAINS)
memcpy((void *)0xffff001c, start, length);
+#else
+ memcpy(vectors_page + 0x1c, start, length);
+#endif
flush_icache_range(0xffff001c, 0xffff001c + length);
if (!vectors_high())
flush_icache_range(0x1c, 0x1c + length);
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 1621e53..6bc57c5 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -36,6 +36,8 @@
static const char *handler[]= { "prefetch abort", "data abort", "address exception", "interrupt" };
+void *vectors_page;
+
#ifdef CONFIG_DEBUG_USER
unsigned int user_debug;
@@ -745,7 +747,11 @@ void __init trap_init(void)
void __init early_trap_init(void)
{
+#if defined(CONFIG_CPU_USE_DOMAINS)
unsigned long vectors = CONFIG_VECTORS_BASE;
+#else
+ unsigned long vectors = (unsigned long)vectors_page;
+#endif
extern char __stubs_start[], __stubs_end[];
extern char __vectors_start[], __vectors_end[];
extern char __kuser_helper_start[], __kuser_helper_end[];
@@ -764,10 +770,10 @@ void __init early_trap_init(void)
* Copy signal return handlers into the vector page, and
* set sigreturn to be a pointer to these.
*/
- memcpy((void *)KERN_SIGRETURN_CODE, sigreturn_codes,
- sizeof(sigreturn_codes));
- memcpy((void *)KERN_RESTART_CODE, syscall_restart_code,
- sizeof(syscall_restart_code));
+ memcpy((void *)(vectors + KERN_SIGRETURN_CODE - CONFIG_VECTORS_BASE),
+ sigreturn_codes, sizeof(sigreturn_codes));
+ memcpy((void *)(vectors + KERN_RESTART_CODE - CONFIG_VECTORS_BASE),
+ syscall_restart_code, sizeof(syscall_restart_code));
flush_icache_range(vectors, vectors + PAGE_SIZE);
modify_domain(DOMAIN_USER, DOMAIN_CLIENT);
diff --git a/arch/arm/lib/getuser.S b/arch/arm/lib/getuser.S
index b1631a7..1b049cd 100644
--- a/arch/arm/lib/getuser.S
+++ b/arch/arm/lib/getuser.S
@@ -28,20 +28,21 @@
*/
#include <linux/linkage.h>
#include <asm/errno.h>
+#include <asm/domain.h>
ENTRY(__get_user_1)
-1: ldrbt r2, [r0]
+1: T(ldrb) r2, [r0]
mov r0, #0
mov pc, lr
ENDPROC(__get_user_1)
ENTRY(__get_user_2)
#ifdef CONFIG_THUMB2_KERNEL
-2: ldrbt r2, [r0]
-3: ldrbt r3, [r0, #1]
+2: T(ldrb) r2, [r0]
+3: T(ldrb) r3, [r0, #1]
#else
-2: ldrbt r2, [r0], #1
-3: ldrbt r3, [r0]
+2: T(ldrb) r2, [r0], #1
+3: T(ldrb) r3, [r0]
#endif
#ifndef __ARMEB__
orr r2, r2, r3, lsl #8
@@ -53,7 +54,7 @@ ENTRY(__get_user_2)
ENDPROC(__get_user_2)
ENTRY(__get_user_4)
-4: ldrt r2, [r0]
+4: T(ldr) r2, [r0]
mov r0, #0
mov pc, lr
ENDPROC(__get_user_4)
diff --git a/arch/arm/lib/putuser.S b/arch/arm/lib/putuser.S
index 5a01a23..c023fc1 100644
--- a/arch/arm/lib/putuser.S
+++ b/arch/arm/lib/putuser.S
@@ -28,9 +28,10 @@
*/
#include <linux/linkage.h>
#include <asm/errno.h>
+#include <asm/domain.h>
ENTRY(__put_user_1)
-1: strbt r2, [r0]
+1: T(strb) r2, [r0]
mov r0, #0
mov pc, lr
ENDPROC(__put_user_1)
@@ -39,19 +40,19 @@ ENTRY(__put_user_2)
mov ip, r2, lsr #8
#ifdef CONFIG_THUMB2_KERNEL
#ifndef __ARMEB__
-2: strbt r2, [r0]
-3: strbt ip, [r0, #1]
+2: T(strb) r2, [r0]
+3: T(strb) ip, [r0, #1]
#else
-2: strbt ip, [r0]
-3: strbt r2, [r0, #1]
+2: T(strb) ip, [r0]
+3: T(strb) r2, [r0, #1]
#endif
#else /* !CONFIG_THUMB2_KERNEL */
#ifndef __ARMEB__
-2: strbt r2, [r0], #1
-3: strbt ip, [r0]
+2: T(strb) r2, [r0], #1
+3: T(strb) ip, [r0]
#else
-2: strbt ip, [r0], #1
-3: strbt r2, [r0]
+2: T(strb) ip, [r0], #1
+3: T(strb) r2, [r0]
#endif
#endif /* CONFIG_THUMB2_KERNEL */
mov r0, #0
@@ -59,18 +60,18 @@ ENTRY(__put_user_2)
ENDPROC(__put_user_2)
ENTRY(__put_user_4)
-4: strt r2, [r0]
+4: T(str) r2, [r0]
mov r0, #0
mov pc, lr
ENDPROC(__put_user_4)
ENTRY(__put_user_8)
#ifdef CONFIG_THUMB2_KERNEL
-5: strt r2, [r0]
-6: strt r3, [r0, #4]
+5: T(str) r2, [r0]
+6: T(str) r3, [r0, #4]
#else
-5: strt r2, [r0], #4
-6: strt r3, [r0]
+5: T(str) r2, [r0], #4
+6: T(str) r3, [r0]
#endif
mov r0, #0
mov pc, lr
diff --git a/arch/arm/lib/uaccess.S b/arch/arm/lib/uaccess.S
index fee9f6f..d0ece2a 100644
--- a/arch/arm/lib/uaccess.S
+++ b/arch/arm/lib/uaccess.S
@@ -14,6 +14,7 @@
#include <linux/linkage.h>
#include <asm/assembler.h>
#include <asm/errno.h>
+#include <asm/domain.h>
.text
@@ -31,11 +32,11 @@
rsb ip, ip, #4
cmp ip, #2
ldrb r3, [r1], #1
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
ldrgeb r3, [r1], #1
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
ldrgtb r3, [r1], #1
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
sub r2, r2, ip
b .Lc2u_dest_aligned
@@ -58,7 +59,7 @@ ENTRY(__copy_to_user)
addmi ip, r2, #4
bmi .Lc2u_0nowords
ldr r3, [r1], #4
-USER( strt r3, [r0], #4) @ May fault
+USER( T(str) r3, [r0], #4) @ May fault
mov ip, r0, lsl #32 - PAGE_SHIFT @ On each page, use a ld/st??t instruction
rsb ip, ip, #0
movs ip, ip, lsr #32 - PAGE_SHIFT
@@ -87,18 +88,18 @@ USER( strt r3, [r0], #4) @ May fault
stmneia r0!, {r3 - r4} @ Shouldnt fault
tst ip, #4
ldrne r3, [r1], #4
- strnet r3, [r0], #4 @ Shouldnt fault
+ T(strne) r3, [r0], #4 @ Shouldnt fault
ands ip, ip, #3
beq .Lc2u_0fupi
.Lc2u_0nowords: teq ip, #0
beq .Lc2u_finished
.Lc2u_nowords: cmp ip, #2
ldrb r3, [r1], #1
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
ldrgeb r3, [r1], #1
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
ldrgtb r3, [r1], #1
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
b .Lc2u_finished
.Lc2u_not_enough:
@@ -119,7 +120,7 @@ USER( strgtbt r3, [r0], #1) @ May fault
mov r3, r7, pull #8
ldr r7, [r1], #4
orr r3, r3, r7, push #24
-USER( strt r3, [r0], #4) @ May fault
+USER( T(str) r3, [r0], #4) @ May fault
mov ip, r0, lsl #32 - PAGE_SHIFT
rsb ip, ip, #0
movs ip, ip, lsr #32 - PAGE_SHIFT
@@ -154,18 +155,18 @@ USER( strt r3, [r0], #4) @ May fault
movne r3, r7, pull #8
ldrne r7, [r1], #4
orrne r3, r3, r7, push #24
- strnet r3, [r0], #4 @ Shouldnt fault
+ T(strne) r3, [r0], #4 @ Shouldnt fault
ands ip, ip, #3
beq .Lc2u_1fupi
.Lc2u_1nowords: mov r3, r7, get_byte_1
teq ip, #0
beq .Lc2u_finished
cmp ip, #2
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
movge r3, r7, get_byte_2
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
movgt r3, r7, get_byte_3
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
b .Lc2u_finished
.Lc2u_2fupi: subs r2, r2, #4
@@ -174,7 +175,7 @@ USER( strgtbt r3, [r0], #1) @ May fault
mov r3, r7, pull #16
ldr r7, [r1], #4
orr r3, r3, r7, push #16
-USER( strt r3, [r0], #4) @ May fault
+USER( T(str) r3, [r0], #4) @ May fault
mov ip, r0, lsl #32 - PAGE_SHIFT
rsb ip, ip, #0
movs ip, ip, lsr #32 - PAGE_SHIFT
@@ -209,18 +210,18 @@ USER( strt r3, [r0], #4) @ May fault
movne r3, r7, pull #16
ldrne r7, [r1], #4
orrne r3, r3, r7, push #16
- strnet r3, [r0], #4 @ Shouldnt fault
+ T(strne) r3, [r0], #4 @ Shouldnt fault
ands ip, ip, #3
beq .Lc2u_2fupi
.Lc2u_2nowords: mov r3, r7, get_byte_2
teq ip, #0
beq .Lc2u_finished
cmp ip, #2
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
movge r3, r7, get_byte_3
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
ldrgtb r3, [r1], #0
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
b .Lc2u_finished
.Lc2u_3fupi: subs r2, r2, #4
@@ -229,7 +230,7 @@ USER( strgtbt r3, [r0], #1) @ May fault
mov r3, r7, pull #24
ldr r7, [r1], #4
orr r3, r3, r7, push #8
-USER( strt r3, [r0], #4) @ May fault
+USER( T(str) r3, [r0], #4) @ May fault
mov ip, r0, lsl #32 - PAGE_SHIFT
rsb ip, ip, #0
movs ip, ip, lsr #32 - PAGE_SHIFT
@@ -264,18 +265,18 @@ USER( strt r3, [r0], #4) @ May fault
movne r3, r7, pull #24
ldrne r7, [r1], #4
orrne r3, r3, r7, push #8
- strnet r3, [r0], #4 @ Shouldnt fault
+ T(strne) r3, [r0], #4 @ Shouldnt fault
ands ip, ip, #3
beq .Lc2u_3fupi
.Lc2u_3nowords: mov r3, r7, get_byte_3
teq ip, #0
beq .Lc2u_finished
cmp ip, #2
-USER( strbt r3, [r0], #1) @ May fault
+USER( T(strb) r3, [r0], #1) @ May fault
ldrgeb r3, [r1], #1
-USER( strgebt r3, [r0], #1) @ May fault
+USER( T(strgeb) r3, [r0], #1) @ May fault
ldrgtb r3, [r1], #0
-USER( strgtbt r3, [r0], #1) @ May fault
+USER( T(strgtb) r3, [r0], #1) @ May fault
b .Lc2u_finished
ENDPROC(__copy_to_user)
@@ -294,11 +295,11 @@ ENDPROC(__copy_to_user)
.Lcfu_dest_not_aligned:
rsb ip, ip, #4
cmp ip, #2
-USER( ldrbt r3, [r1], #1) @ May fault
+USER( T(ldrb) r3, [r1], #1) @ May fault
strb r3, [r0], #1
-USER( ldrgebt r3, [r1], #1) @ May fault
+USER( T(ldrgeb) r3, [r1], #1) @ May fault
strgeb r3, [r0], #1
-USER( ldrgtbt r3, [r1], #1) @ May fault
+USER( T(ldrgtb) r3, [r1], #1) @ May fault
strgtb r3, [r0], #1
sub r2, r2, ip
b .Lcfu_dest_aligned
@@ -321,7 +322,7 @@ ENTRY(__copy_from_user)
.Lcfu_0fupi: subs r2, r2, #4
addmi ip, r2, #4
bmi .Lcfu_0nowords
-USER( ldrt r3, [r1], #4)
+USER( T(ldr) r3, [r1], #4)
str r3, [r0], #4
mov ip, r1, lsl #32 - PAGE_SHIFT @ On each page, use a ld/st??t instruction
rsb ip, ip, #0
@@ -350,18 +351,18 @@ USER( ldrt r3, [r1], #4)
ldmneia r1!, {r3 - r4} @ Shouldnt fault
stmneia r0!, {r3 - r4}
tst ip, #4
- ldrnet r3, [r1], #4 @ Shouldnt fault
+ T(ldrne) r3, [r1], #4 @ Shouldnt fault
strne r3, [r0], #4
ands ip, ip, #3
beq .Lcfu_0fupi
.Lcfu_0nowords: teq ip, #0
beq .Lcfu_finished
.Lcfu_nowords: cmp ip, #2
-USER( ldrbt r3, [r1], #1) @ May fault
+USER( T(ldrb) r3, [r1], #1) @ May fault
strb r3, [r0], #1
-USER( ldrgebt r3, [r1], #1) @ May fault
+USER( T(ldrgeb) r3, [r1], #1) @ May fault
strgeb r3, [r0], #1
-USER( ldrgtbt r3, [r1], #1) @ May fault
+USER( T(ldrgtb) r3, [r1], #1) @ May fault
strgtb r3, [r0], #1
b .Lcfu_finished
@@ -374,7 +375,7 @@ USER( ldrgtbt r3, [r1], #1) @ May fault
.Lcfu_src_not_aligned:
bic r1, r1, #3
-USER( ldrt r7, [r1], #4) @ May fault
+USER( T(ldr) r7, [r1], #4) @ May fault
cmp ip, #2
bgt .Lcfu_3fupi
beq .Lcfu_2fupi
@@ -382,7 +383,7 @@ USER( ldrt r7, [r1], #4) @ May fault
addmi ip, r2, #4
bmi .Lcfu_1nowords
mov r3, r7, pull #8
-USER( ldrt r7, [r1], #4) @ May fault
+USER( T(ldr) r7, [r1], #4) @ May fault
orr r3, r3, r7, push #24
str r3, [r0], #4
mov ip, r1, lsl #32 - PAGE_SHIFT
@@ -417,7 +418,7 @@ USER( ldrt r7, [r1], #4) @ May fault
stmneia r0!, {r3 - r4}
tst ip, #4
movne r3, r7, pull #8
-USER( ldrnet r7, [r1], #4) @ May fault
+USER( T(ldrne) r7, [r1], #4) @ May fault
orrne r3, r3, r7, push #24
strne r3, [r0], #4
ands ip, ip, #3
@@ -437,7 +438,7 @@ USER( ldrnet r7, [r1], #4) @ May fault
addmi ip, r2, #4
bmi .Lcfu_2nowords
mov r3, r7, pull #16
-USER( ldrt r7, [r1], #4) @ May fault
+USER( T(ldr) r7, [r1], #4) @ May fault
orr r3, r3, r7, push #16
str r3, [r0], #4
mov ip, r1, lsl #32 - PAGE_SHIFT
@@ -473,7 +474,7 @@ USER( ldrt r7, [r1], #4) @ May fault
stmneia r0!, {r3 - r4}
tst ip, #4
movne r3, r7, pull #16
-USER( ldrnet r7, [r1], #4) @ May fault
+USER( T(ldrne) r7, [r1], #4) @ May fault
orrne r3, r3, r7, push #16
strne r3, [r0], #4
ands ip, ip, #3
@@ -485,7 +486,7 @@ USER( ldrnet r7, [r1], #4) @ May fault
strb r3, [r0], #1
movge r3, r7, get_byte_3
strgeb r3, [r0], #1
-USER( ldrgtbt r3, [r1], #0) @ May fault
+USER( T(ldrgtb) r3, [r1], #0) @ May fault
strgtb r3, [r0], #1
b .Lcfu_finished
@@ -493,7 +494,7 @@ USER( ldrgtbt r3, [r1], #0) @ May fault
addmi ip, r2, #4
bmi .Lcfu_3nowords
mov r3, r7, pull #24
-USER( ldrt r7, [r1], #4) @ May fault
+USER( T(ldr) r7, [r1], #4) @ May fault
orr r3, r3, r7, push #8
str r3, [r0], #4
mov ip, r1, lsl #32 - PAGE_SHIFT
@@ -528,7 +529,7 @@ USER( ldrt r7, [r1], #4) @ May fault
stmneia r0!, {r3 - r4}
tst ip, #4
movne r3, r7, pull #24
-USER( ldrnet r7, [r1], #4) @ May fault
+USER( T(ldrne) r7, [r1], #4) @ May fault
orrne r3, r3, r7, push #8
strne r3, [r0], #4
ands ip, ip, #3
@@ -538,9 +539,9 @@ USER( ldrnet r7, [r1], #4) @ May fault
beq .Lcfu_finished
cmp ip, #2
strb r3, [r0], #1
-USER( ldrgebt r3, [r1], #1) @ May fault
+USER( T(ldrgeb) r3, [r1], #1) @ May fault
strgeb r3, [r0], #1
-USER( ldrgtbt r3, [r1], #1) @ May fault
+USER( T(ldrgtb) r3, [r1], #1) @ May fault
strgtb r3, [r0], #1
b .Lcfu_finished
ENDPROC(__copy_from_user)
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 101105e..294a57d 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -599,6 +599,14 @@ config CPU_CP15_MPU
help
Processor has the CP15 register, which has MPU related registers.
+config CPU_USE_DOMAINS
+ bool
+ depends on MMU
+ default y if !HAS_TLS_REG
+ help
+ This option enables or disables the use of domain switching
+ via the set_fs() function.
+
#
# CPU supports 36-bit I/O
#
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 2858941..499e22d 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -25,6 +25,7 @@
#include <asm/smp_plat.h>
#include <asm/tlb.h>
#include <asm/highmem.h>
+#include <asm/traps.h>
#include <asm/mach/arch.h>
#include <asm/mach/map.h>
@@ -935,12 +936,11 @@ static void __init devicemaps_init(struct machine_desc *mdesc)
{
struct map_desc map;
unsigned long addr;
- void *vectors;
/*
* Allocate the vector page early.
*/
- vectors = alloc_bootmem_low_pages(PAGE_SIZE);
+ vectors_page = alloc_bootmem_low_pages(PAGE_SIZE);
for (addr = VMALLOC_END; addr; addr += PGDIR_SIZE)
pmd_clear(pmd_off_k(addr));
@@ -980,7 +980,7 @@ static void __init devicemaps_init(struct machine_desc *mdesc)
* location (0xffff0000). If we aren't using high-vectors, also
* create a mapping at the low-vectors virtual address.
*/
- map.pfn = __phys_to_pfn(virt_to_phys(vectors));
+ map.pfn = __phys_to_pfn(virt_to_phys(vectors_page));
map.virtual = 0xffff0000;
map.length = PAGE_SIZE;
map.type = MT_HIGH_VECTORS;
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 7d63bea..337f102 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -99,6 +99,10 @@
* 110x 0 1 0 r/w r/o
* 11x0 0 1 0 r/w r/o
* 1111 0 1 1 r/w r/w
+ *
+ * If !CONFIG_CPU_USE_DOMAINS, the following permissions are changed:
+ * 110x 1 1 1 r/o r/o
+ * 11x0 1 1 1 r/o r/o
*/
.macro armv6_mt_table pfx
\pfx\()_mt_table:
@@ -138,8 +142,11 @@
tst r1, #L_PTE_USER
orrne r3, r3, #PTE_EXT_AP1
+#ifdef CONFIG_CPU_USE_DOMAINS
+ @ allow kernel read/write access to read-only user pages
tstne r3, #PTE_EXT_APX
bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
+#endif
tst r1, #L_PTE_EXEC
orreq r3, r3, #PTE_EXT_XN
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index 7aaf88a..c1c3fe0 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -152,8 +152,11 @@ ENTRY(cpu_v7_set_pte_ext)
tst r1, #L_PTE_USER
orrne r3, r3, #PTE_EXT_AP1
+#ifdef CONFIG_CPU_USE_DOMAINS
+ @ allow kernel read/write access to read-only user pages
tstne r3, #PTE_EXT_APX
bicne r3, r3, #PTE_EXT_APX | PTE_EXT_AP0
+#endif
tst r1, #L_PTE_EXEC
orreq r3, r3, #PTE_EXT_XN
@@ -240,8 +243,6 @@ __v7_setup:
mcr p15, 0, r10, c2, c0, 2 @ TTB control register
orr r4, r4, #TTB_FLAGS
mcr p15, 0, r4, c2, c0, 1 @ load TTB1
- mov r10, #0x1f @ domains 0, 1 = manager
- mcr p15, 0, r10, c3, c0, 0 @ load domain access register
/*
* Memory region attributes with SCTLR.TRE=1
*
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/9] ARM: Assume new page cache pages have dirty D-cache
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
2010-07-19 13:44 ` [PATCH 1/9] ARM: Remove the domain switching on ARMv6k/v7 CPUs Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-19 13:44 ` [PATCH 3/9] ARM: Introduce __sync_icache_dcache() for VIPT caches Catalin Marinas
` (7 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
There are places in Linux where writes to newly allocated page cache
pages happen without a subsequent call to flush_dcache_page() (several
PIO drivers including USB HCD). This patch changes the meaning of
PG_arch_1 to be PG_dcache_clean and always flush the D-cache for a newly
mapped page in update_mmu_cache().
The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
---
arch/arm/include/asm/cacheflush.h | 6 +++---
arch/arm/include/asm/tlbflush.h | 2 +-
arch/arm/mm/copypage-v4mc.c | 2 +-
arch/arm/mm/copypage-v6.c | 2 +-
arch/arm/mm/copypage-xscale.c | 2 +-
arch/arm/mm/dma-mapping.c | 6 ++++++
arch/arm/mm/fault-armv.c | 4 ++--
arch/arm/mm/flush.c | 3 ++-
8 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 4656a24..d3730f0 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -137,10 +137,10 @@
#endif
/*
- * This flag is used to indicate that the page pointed to by a pte
- * is dirty and requires cleaning before returning it to the user.
+ * This flag is used to indicate that the page pointed to by a pte is clean
+ * and does not require cleaning before returning it to the user.
*/
-#define PG_dcache_dirty PG_arch_1
+#define PG_dcache_clean PG_arch_1
/*
* MM Cache Management
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index bd863d8..40a7092 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -552,7 +552,7 @@ extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
#endif
/*
- * if PG_dcache_dirty is set for the page, we need to ensure that any
+ * If PG_dcache_clean is not set for the page, we need to ensure that any
* cache entries for the kernels virtual memory range are written
* back to the page.
*/
diff --git a/arch/arm/mm/copypage-v4mc.c b/arch/arm/mm/copypage-v4mc.c
index 598c51a..b806151 100644
--- a/arch/arm/mm/copypage-v4mc.c
+++ b/arch/arm/mm/copypage-v4mc.c
@@ -73,7 +73,7 @@ void v4_mc_copy_user_highpage(struct page *to, struct page *from,
{
void *kto = kmap_atomic(to, KM_USER1);
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
spin_lock(&minicache_lock);
diff --git a/arch/arm/mm/copypage-v6.c b/arch/arm/mm/copypage-v6.c
index f55fa10..bdba6c6 100644
--- a/arch/arm/mm/copypage-v6.c
+++ b/arch/arm/mm/copypage-v6.c
@@ -79,7 +79,7 @@ static void v6_copy_user_highpage_aliasing(struct page *to,
unsigned int offset = CACHE_COLOUR(vaddr);
unsigned long kfrom, kto;
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
/* FIXME: not highmem safe */
diff --git a/arch/arm/mm/copypage-xscale.c b/arch/arm/mm/copypage-xscale.c
index 9920c0a..649bbcd 100644
--- a/arch/arm/mm/copypage-xscale.c
+++ b/arch/arm/mm/copypage-xscale.c
@@ -95,7 +95,7 @@ void xscale_mc_copy_user_highpage(struct page *to, struct page *from,
{
void *kto = kmap_atomic(to, KM_USER1);
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
spin_lock(&minicache_lock);
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 9e7742f..fa3d07d 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -508,6 +508,12 @@ void ___dma_page_dev_to_cpu(struct page *page, unsigned long off,
outer_inv_range(paddr, paddr + size);
dma_cache_maint_page(page, off, size, dir, dmac_unmap_area);
+
+ /*
+ * Mark the D-cache clean for this page to avoid extra flushing.
+ */
+ if (dir != DMA_TO_DEVICE && off == 0 && size >= PAGE_SIZE)
+ set_bit(PG_dcache_clean, &page->flags);
}
EXPORT_SYMBOL(___dma_page_dev_to_cpu);
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 9b906de..58846cb 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -141,7 +141,7 @@ make_coherent(struct address_space *mapping, struct vm_area_struct *vma,
* a page table, or changing an existing PTE. Basically, there are two
* things that we need to take care of:
*
- * 1. If PG_dcache_dirty is set for the page, we need to ensure
+ * 1. If PG_dcache_clean is not set for the page, we need to ensure
* that any cache entries for the kernels virtual memory
* range are written back to the page.
* 2. If we have multiple shared mappings of the same space in
@@ -169,7 +169,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
mapping = page_mapping(page);
#ifndef CONFIG_SMP
- if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &page->flags))
__flush_dcache_page(mapping, page);
#endif
if (mapping) {
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index c6844cb..9f9919b 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -248,7 +248,7 @@ void flush_dcache_page(struct page *page)
#ifndef CONFIG_SMP
if (!PageHighMem(page) && mapping && !mapping_mapped(mapping))
- set_bit(PG_dcache_dirty, &page->flags);
+ clear_bit(PG_dcache_clean, &page->flags);
else
#endif
{
@@ -257,6 +257,7 @@ void flush_dcache_page(struct page *page)
__flush_dcache_aliases(mapping, page);
else if (mapping)
__flush_icache_all();
+ set_bit(PG_dcache_clean, &page->flags);
}
}
EXPORT_SYMBOL(flush_dcache_page);
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/9] ARM: Introduce __sync_icache_dcache() for VIPT caches
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
2010-07-19 13:44 ` [PATCH 1/9] ARM: Remove the domain switching on ARMv6k/v7 CPUs Catalin Marinas
2010-07-19 13:44 ` [PATCH 2/9] ARM: Assume new page cache pages have dirty D-cache Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-19 13:44 ` [PATCH 4/9] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
` (6 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
On SMP systems, there is a small chance of a PTE becoming visible to a
different CPU before the current cache maintenance operations in
update_mmu_cache(). To avoid this, cache maintenance must be handled in
set_pte_at() (similar to IA-64 and PowerPC).
This patch provides a unified VIPT cache handling mechanism and
implements the __sync_icache_dcache() function for ARMv6 onwards
architectures. It is called from set_pte_at() and replaces the
update_mmu_cache(). The latter is still used on VIVT hardware where a
vm_area_struct is required.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
---
arch/arm/include/asm/pgtable.h | 26 +++++++++++++++++++++++---
arch/arm/include/asm/tlbflush.h | 10 +++++++++-
arch/arm/mm/fault-armv.c | 4 ++--
arch/arm/mm/flush.c | 30 ++++++++++++++++++++++++++++++
4 files changed, 64 insertions(+), 6 deletions(-)
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index ab68cf1..42e694f 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -278,9 +278,24 @@ extern struct page *empty_zero_page;
#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
-#define set_pte_at(mm,addr,ptep,pteval) do { \
- set_pte_ext(ptep, pteval, (addr) >= TASK_SIZE ? 0 : PTE_EXT_NG); \
- } while (0)
+#if __LINUX_ARM_ARCH__ < 6
+static inline void __sync_icache_dcache(pte_t pteval)
+{
+}
+#else
+extern void __sync_icache_dcache(pte_t pteval);
+#endif
+
+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pteval)
+{
+ if (addr >= TASK_SIZE)
+ set_pte_ext(ptep, pteval, 0);
+ else {
+ __sync_icache_dcache(pteval);
+ set_pte_ext(ptep, pteval, PTE_EXT_NG);
+ }
+}
/*
* The following only work if pte_present() is true.
@@ -290,8 +305,13 @@ extern struct page *empty_zero_page;
#define pte_write(pte) (pte_val(pte) & L_PTE_WRITE)
#define pte_dirty(pte) (pte_val(pte) & L_PTE_DIRTY)
#define pte_young(pte) (pte_val(pte) & L_PTE_YOUNG)
+#define pte_exec(pte) (pte_val(pte) & L_PTE_EXEC)
#define pte_special(pte) (0)
+#define pte_present_user(pte) \
+ ((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
+ (L_PTE_PRESENT | L_PTE_USER))
+
#define PTE_BIT_FUNC(fn,op) \
static inline pte_t pte_##fn(pte_t pte) { pte_val(pte) op; return pte; }
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index 40a7092..8ec4775 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -554,10 +554,18 @@ extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
/*
* If PG_dcache_clean is not set for the page, we need to ensure that any
* cache entries for the kernels virtual memory range are written
- * back to the page.
+ * back to the page. On ARMv6 and later, the cache coherency is handled via
+ * the set_pte_at() function.
*/
+#if __LINUX_ARM_ARCH__ < 6
extern void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
pte_t *ptep);
+#else
+static inline void update_mmu_cache(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+}
+#endif
#endif
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 58846cb..8440d95 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -28,6 +28,7 @@
static unsigned long shared_pte_mask = L_PTE_MT_BUFFERABLE;
+#if __LINUX_ARM_ARCH__ < 6
/*
* We take the easy way out of this problem - we make the
* PTE uncacheable. However, we leave the write buffer on.
@@ -168,10 +169,8 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
return;
mapping = page_mapping(page);
-#ifndef CONFIG_SMP
if (!test_and_set_bit(PG_dcache_clean, &page->flags))
__flush_dcache_page(mapping, page);
-#endif
if (mapping) {
if (cache_is_vivt())
make_coherent(mapping, vma, addr, ptep, pfn);
@@ -179,6 +178,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
__flush_icache_all();
}
}
+#endif /* __LINUX_ARM_ARCH__ < 6 */
/*
* Check whether the write buffer has physical address aliasing
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 9f9919b..78134f5 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -215,6 +215,36 @@ static void __flush_dcache_aliases(struct address_space *mapping, struct page *p
flush_dcache_mmap_unlock(mapping);
}
+#if __LINUX_ARM_ARCH__ >= 6
+void __sync_icache_dcache(pte_t pteval)
+{
+ unsigned long pfn;
+ struct page *page;
+ struct address_space *mapping;
+
+ if (!pte_present_user(pteval))
+ return;
+ if (cache_is_vipt_nonaliasing() && !pte_exec(pteval))
+ /* only flush non-aliasing VIPT caches for exec mappings */
+ return;
+ pfn = pte_pfn(pteval);
+ if (!pfn_valid(pfn))
+ return;
+
+ page = pfn_to_page(pfn);
+ if (cache_is_vipt_aliasing())
+ mapping = page_mapping(page);
+ else
+ mapping = NULL;
+
+ if (!test_and_set_bit(PG_dcache_clean, &page->flags))
+ __flush_dcache_page(mapping, page);
+ /* pte_exec() already checked above for non-aliasing VIPT cache */
+ if (cache_is_vipt_nonaliasing() || pte_exec(pteval))
+ __flush_icache_all();
+}
+#endif
+
/*
* Ensure cache coherency between kernel mapping and userspace mapping
* of this page.
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/9] ARM: Use lazy cache flushing on ARMv7 SMP systems
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
` (2 preceding siblings ...)
2010-07-19 13:44 ` [PATCH 3/9] ARM: Introduce __sync_icache_dcache() for VIPT caches Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-19 13:44 ` [PATCH 5/9] ARM: Introduce *_relaxed() I/O accessors Catalin Marinas
` (5 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
ARMv7 processors like Cortex-A9 broadcast the cache maintenance
operations in hardware. This patch allows the
flush_dcache_page/update_mmu_cache pair to work in lazy flushing mode
similar to the UP case.
Note that cache flushing on SMP systems now takes place via the
set_pte_at() call (__sync_icache_dcache) and there is no race with other
CPUs executing code from the new PTE before the cache flushing took
place.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
---
arch/arm/include/asm/smp_plat.h | 4 ++++
arch/arm/mm/flush.c | 13 ++++---------
2 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/arch/arm/include/asm/smp_plat.h b/arch/arm/include/asm/smp_plat.h
index e621530..963a338 100644
--- a/arch/arm/include/asm/smp_plat.h
+++ b/arch/arm/include/asm/smp_plat.h
@@ -13,9 +13,13 @@ static inline int tlb_ops_need_broadcast(void)
return ((read_cpuid_ext(CPUID_EXT_MMFR3) >> 12) & 0xf) < 2;
}
+#if !defined(CONFIG_SMP) || __LINUX_ARM_ARCH__ >= 7
+#define cache_ops_need_broadcast() 0
+#else
static inline int cache_ops_need_broadcast(void)
{
return ((read_cpuid_ext(CPUID_EXT_MMFR3) >> 12) & 0xf) < 1;
}
+#endif
#endif
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 78134f5..53f83a7 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -17,6 +17,7 @@
#include <asm/smp_plat.h>
#include <asm/system.h>
#include <asm/tlbflush.h>
+#include <asm/smp_plat.h>
#include "mm.h"
@@ -93,12 +94,10 @@ void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr, unsig
#define flush_pfn_alias(pfn,vaddr) do { } while (0)
#endif
-#ifdef CONFIG_SMP
static void flush_ptrace_access_other(void *args)
{
__flush_icache_all();
}
-#endif
static
void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
@@ -122,11 +121,9 @@ void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
if (vma->vm_flags & VM_EXEC) {
unsigned long addr = (unsigned long)kaddr;
__cpuc_coherent_kern_range(addr, addr + len);
-#ifdef CONFIG_SMP
if (cache_ops_need_broadcast())
smp_call_function(flush_ptrace_access_other,
NULL, 1);
-#endif
}
}
@@ -276,12 +273,10 @@ void flush_dcache_page(struct page *page)
mapping = page_mapping(page);
-#ifndef CONFIG_SMP
- if (!PageHighMem(page) && mapping && !mapping_mapped(mapping))
+ if (!cache_ops_need_broadcast() &&
+ !PageHighMem(page) && mapping && !mapping_mapped(mapping))
clear_bit(PG_dcache_clean, &page->flags);
- else
-#endif
- {
+ else {
__flush_dcache_page(mapping, page);
if (mapping && cache_is_vivt())
__flush_dcache_aliases(mapping, page);
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 5/9] ARM: Introduce *_relaxed() I/O accessors
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
` (3 preceding siblings ...)
2010-07-19 13:44 ` [PATCH 4/9] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-19 13:44 ` [PATCH 6/9] ARM: Convert L2x0 to use the IO relaxed operations Catalin Marinas
` (4 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
This patch introduces readl*_relaxed()/write*_relaxed() as the main I/O
accessors (when __mem_pci is defined). The standard read*()/write*()
macros are now based on the relaxed accessors.
This patch is in preparation for a subsequent patch which adds barriers
to the I/O accessors.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/include/asm/io.h | 29 +++++++++++++++++------------
1 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index c980156..9db072d 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -179,25 +179,30 @@ extern void _memset_io(volatile void __iomem *, int, size_t);
* IO port primitives for more information.
*/
#ifdef __mem_pci
-#define readb(c) ({ __u8 __v = __raw_readb(__mem_pci(c)); __v; })
-#define readw(c) ({ __u16 __v = le16_to_cpu((__force __le16) \
+#define readb_relaxed(c) ({ u8 __v = __raw_readb(__mem_pci(c)); __v; })
+#define readw_relaxed(c) ({ u16 __v = le16_to_cpu((__force __le16) \
__raw_readw(__mem_pci(c))); __v; })
-#define readl(c) ({ __u32 __v = le32_to_cpu((__force __le32) \
+#define readl_relaxed(c) ({ u32 __v = le32_to_cpu((__force __le32) \
__raw_readl(__mem_pci(c))); __v; })
-#define readb_relaxed(addr) readb(addr)
-#define readw_relaxed(addr) readw(addr)
-#define readl_relaxed(addr) readl(addr)
+
+#define writeb_relaxed(v,c) ((void)__raw_writeb(v,__mem_pci(c)))
+#define writew_relaxed(v,c) ((void)__raw_writew((__force u16) \
+ cpu_to_le16(v),__mem_pci(c)))
+#define writel_relaxed(v,c) ((void)__raw_writel((__force u32) \
+ cpu_to_le32(v),__mem_pci(c)))
+
+#define readb(c) readb_relaxed(c)
+#define readw(c) readw_relaxed(c)
+#define readl(c) readl_relaxed(c)
+
+#define writeb(v,c) writeb_relaxed(v,c)
+#define writew(v,c) writew_relaxed(v,c)
+#define writel(v,c) writel_relaxed(v,c)
#define readsb(p,d,l) __raw_readsb(__mem_pci(p),d,l)
#define readsw(p,d,l) __raw_readsw(__mem_pci(p),d,l)
#define readsl(p,d,l) __raw_readsl(__mem_pci(p),d,l)
-#define writeb(v,c) __raw_writeb(v,__mem_pci(c))
-#define writew(v,c) __raw_writew((__force __u16) \
- cpu_to_le16(v),__mem_pci(c))
-#define writel(v,c) __raw_writel((__force __u32) \
- cpu_to_le32(v),__mem_pci(c))
-
#define writesb(p,d,l) __raw_writesb(__mem_pci(p),d,l)
#define writesw(p,d,l) __raw_writesw(__mem_pci(p),d,l)
#define writesl(p,d,l) __raw_writesl(__mem_pci(p),d,l)
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 6/9] ARM: Convert L2x0 to use the IO relaxed operations
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
` (4 preceding siblings ...)
2010-07-19 13:44 ` [PATCH 5/9] ARM: Introduce *_relaxed() I/O accessors Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-19 13:44 ` [PATCH 7/9] ARM: Add barriers to the I/O accessors if ARM_DMA_MEM_BUFFERABLE Catalin Marinas
` (3 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
This patch is in preparation for a subsequent patch which adds barriers
to the I/O accessors. Since the mandatory barriers may do an L2 cache
sync, this patch avoids a recursive call into l2x0_cache_sync() via the
write*() accessors and wmb() and a call into l2x0_cache_sync() with the
l2x0_lock held.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/mm/cache-l2x0.c | 26 +++++++++++++-------------
1 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index 9819869..532c35c 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -32,14 +32,14 @@ static uint32_t l2x0_way_mask; /* Bitmask of active ways */
static inline void cache_wait(void __iomem *reg, unsigned long mask)
{
/* wait for the operation to complete */
- while (readl(reg) & mask)
+ while (readl_relaxed(reg) & mask)
;
}
static inline void cache_sync(void)
{
void __iomem *base = l2x0_base;
- writel(0, base + L2X0_CACHE_SYNC);
+ writel_relaxed(0, base + L2X0_CACHE_SYNC);
cache_wait(base + L2X0_CACHE_SYNC, 1);
}
@@ -47,14 +47,14 @@ static inline void l2x0_clean_line(unsigned long addr)
{
void __iomem *base = l2x0_base;
cache_wait(base + L2X0_CLEAN_LINE_PA, 1);
- writel(addr, base + L2X0_CLEAN_LINE_PA);
+ writel_relaxed(addr, base + L2X0_CLEAN_LINE_PA);
}
static inline void l2x0_inv_line(unsigned long addr)
{
void __iomem *base = l2x0_base;
cache_wait(base + L2X0_INV_LINE_PA, 1);
- writel(addr, base + L2X0_INV_LINE_PA);
+ writel_relaxed(addr, base + L2X0_INV_LINE_PA);
}
#ifdef CONFIG_PL310_ERRATA_588369
@@ -75,9 +75,9 @@ static inline void l2x0_flush_line(unsigned long addr)
/* Clean by PA followed by Invalidate by PA */
cache_wait(base + L2X0_CLEAN_LINE_PA, 1);
- writel(addr, base + L2X0_CLEAN_LINE_PA);
+ writel_relaxed(addr, base + L2X0_CLEAN_LINE_PA);
cache_wait(base + L2X0_INV_LINE_PA, 1);
- writel(addr, base + L2X0_INV_LINE_PA);
+ writel_relaxed(addr, base + L2X0_INV_LINE_PA);
}
#else
@@ -90,7 +90,7 @@ static inline void l2x0_flush_line(unsigned long addr)
{
void __iomem *base = l2x0_base;
cache_wait(base + L2X0_CLEAN_INV_LINE_PA, 1);
- writel(addr, base + L2X0_CLEAN_INV_LINE_PA);
+ writel_relaxed(addr, base + L2X0_CLEAN_INV_LINE_PA);
}
#endif
@@ -109,7 +109,7 @@ static inline void l2x0_inv_all(void)
/* invalidate all ways */
spin_lock_irqsave(&l2x0_lock, flags);
- writel(l2x0_way_mask, l2x0_base + L2X0_INV_WAY);
+ writel_relaxed(l2x0_way_mask, l2x0_base + L2X0_INV_WAY);
cache_wait(l2x0_base + L2X0_INV_WAY, l2x0_way_mask);
cache_sync();
spin_unlock_irqrestore(&l2x0_lock, flags);
@@ -215,8 +215,8 @@ void __init l2x0_init(void __iomem *base, __u32 aux_val, __u32 aux_mask)
l2x0_base = base;
- cache_id = readl(l2x0_base + L2X0_CACHE_ID);
- aux = readl(l2x0_base + L2X0_AUX_CTRL);
+ cache_id = readl_relaxed(l2x0_base + L2X0_CACHE_ID);
+ aux = readl_relaxed(l2x0_base + L2X0_AUX_CTRL);
/* Determine the number of ways */
switch (cache_id & L2X0_CACHE_ID_PART_MASK) {
@@ -245,17 +245,17 @@ void __init l2x0_init(void __iomem *base, __u32 aux_val, __u32 aux_mask)
* If you are booting from non-secure mode
* accessing the below registers will fault.
*/
- if (!(readl(l2x0_base + L2X0_CTRL) & 1)) {
+ if (!(readl_relaxed(l2x0_base + L2X0_CTRL) & 1)) {
/* l2x0 controller is disabled */
aux &= aux_mask;
aux |= aux_val;
- writel(aux, l2x0_base + L2X0_AUX_CTRL);
+ writel_relaxed(aux, l2x0_base + L2X0_AUX_CTRL);
l2x0_inv_all();
/* enable L2X0 */
- writel(1, l2x0_base + L2X0_CTRL);
+ writel_relaxed(1, l2x0_base + L2X0_CTRL);
}
outer_cache.inv_range = l2x0_inv_range;
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 7/9] ARM: Add barriers to the I/O accessors if ARM_DMA_MEM_BUFFERABLE
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
` (5 preceding siblings ...)
2010-07-19 13:44 ` [PATCH 6/9] ARM: Convert L2x0 to use the IO relaxed operations Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-19 13:44 ` [PATCH 8/9] ARM: Improve the L2 cache performance when PL310 is used Catalin Marinas
` (2 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
When the coherent DMA buffers are mapped as Normal Non-cacheable
(ARM_DMA_MEM_BUFFERABLE enabled), buffer accesses are no longer ordered
with Device memory accesses causing failures in device drivers that do
not use the mandatory memory barriers before starting a DMA transfer.
LKML discussions led to the conclusion that such barriers have to be
added to the I/O accessors:
http://thread.gmane.org/gmane.linux.kernel/683509/focus=686153
http://thread.gmane.org/gmane.linux.ide/46414
http://thread.gmane.org/gmane.linux.kernel.cross-arch/5250
This patch introduces a wmb() barrier to the write*() I/O accessors to
handle the situations where Normal Non-cacheable writes are still in the
processor (or L2 cache controller) write buffer before a DMA transfer
command is issued. For the read*() accessors, a rmb() is introduced
after the I/O to avoid speculative loads where the driver polls for a
DMA transfer ready bit.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/include/asm/io.h | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index 9db072d..3c91e7c 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -26,6 +26,7 @@
#include <linux/types.h>
#include <asm/byteorder.h>
#include <asm/memory.h>
+#include <asm/system.h>
/*
* ISA I/O bus memory addresses are 1:1 with the physical address.
@@ -191,6 +192,15 @@ extern void _memset_io(volatile void __iomem *, int, size_t);
#define writel_relaxed(v,c) ((void)__raw_writel((__force u32) \
cpu_to_le32(v),__mem_pci(c)))
+#ifdef CONFIG_ARM_DMA_MEM_BUFFERABLE
+#define readb(c) ({ u8 __v = readb_relaxed(c); rmb(); __v; })
+#define readw(c) ({ u16 __v = readw_relaxed(c); rmb(); __v; })
+#define readl(c) ({ u32 __v = readl_relaxed(c); rmb(); __v; })
+
+#define writeb(v,c) ({ wmb(); writeb_relaxed(v,c); })
+#define writew(v,c) ({ wmb(); writew_relaxed(v,c); })
+#define writel(v,c) ({ wmb(); writel_relaxed(v,c); })
+#else
#define readb(c) readb_relaxed(c)
#define readw(c) readw_relaxed(c)
#define readl(c) readl_relaxed(c)
@@ -198,6 +208,7 @@ extern void _memset_io(volatile void __iomem *, int, size_t);
#define writeb(v,c) writeb_relaxed(v,c)
#define writew(v,c) writew_relaxed(v,c)
#define writel(v,c) writel_relaxed(v,c)
+#endif
#define readsb(p,d,l) __raw_readsb(__mem_pci(p),d,l)
#define readsw(p,d,l) __raw_readsw(__mem_pci(p),d,l)
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 8/9] ARM: Improve the L2 cache performance when PL310 is used
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
` (6 preceding siblings ...)
2010-07-19 13:44 ` [PATCH 7/9] ARM: Add barriers to the I/O accessors if ARM_DMA_MEM_BUFFERABLE Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-19 13:44 ` [PATCH 9/9] ARM: Implement phys_mem_access_prot() to avoid attributes aliasing Catalin Marinas
2010-07-22 10:03 ` [PATCH 0/9] CM's patches for the next merging window(s) Russell King - ARM Linux
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
With this L2 cache controller, the cache maintenance by PA and sync
operations are atomic and do not require a "wait" loop. This patch
conditionally defines the cache_wait() function.
Since L2x0 cache controllers do not work with ARMv7 CPUs, the patch
automatically enables CACHE_PL310 when only CPU_V7 is defined.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/mm/Kconfig | 8 ++++++++
arch/arm/mm/cache-l2x0.c | 15 ++++++++++++---
2 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 294a57d..4e93c4a 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -797,6 +797,14 @@ config CACHE_L2X0
help
This option enables the L2x0 PrimeCell.
+config CACHE_PL310
+ bool
+ depends on CACHE_L2X0
+ default y if CPU_V7 && !CPU_V6
+ help
+ This option enables optimisations for the PL310 cache
+ controller.
+
config CACHE_TAUROS2
bool "Enable the Tauros2 L2 cache controller"
depends on (ARCH_DOVE || ARCH_MMP)
diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index 532c35c..b5c573e 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -29,13 +29,22 @@ static void __iomem *l2x0_base;
static DEFINE_SPINLOCK(l2x0_lock);
static uint32_t l2x0_way_mask; /* Bitmask of active ways */
-static inline void cache_wait(void __iomem *reg, unsigned long mask)
+static inline void cache_wait_way(void __iomem *reg, unsigned long mask)
{
- /* wait for the operation to complete */
+ /* wait for cache operation by line or way to complete */
while (readl_relaxed(reg) & mask)
;
}
+#ifdef CONFIG_CACHE_PL310
+static inline void cache_wait(void __iomem *reg, unsigned long mask)
+{
+ /* cache operations by line are atomic on PL310 */
+}
+#else
+#define cache_wait cache_wait_way
+#endif
+
static inline void cache_sync(void)
{
void __iomem *base = l2x0_base;
@@ -110,7 +119,7 @@ static inline void l2x0_inv_all(void)
/* invalidate all ways */
spin_lock_irqsave(&l2x0_lock, flags);
writel_relaxed(l2x0_way_mask, l2x0_base + L2X0_INV_WAY);
- cache_wait(l2x0_base + L2X0_INV_WAY, l2x0_way_mask);
+ cache_wait_way(l2x0_base + L2X0_INV_WAY, l2x0_way_mask);
cache_sync();
spin_unlock_irqrestore(&l2x0_lock, flags);
}
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 9/9] ARM: Implement phys_mem_access_prot() to avoid attributes aliasing
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
` (7 preceding siblings ...)
2010-07-19 13:44 ` [PATCH 8/9] ARM: Improve the L2 cache performance when PL310 is used Catalin Marinas
@ 2010-07-19 13:44 ` Catalin Marinas
2010-07-22 10:03 ` [PATCH 0/9] CM's patches for the next merging window(s) Russell King - ARM Linux
9 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-19 13:44 UTC (permalink / raw)
To: linux-arm-kernel
ARMv7 onwards requires that there are no aliases to the same physical
location using different memory types (i.e. Normal vs Strongly Ordered).
Access to SO mappings when the unaligned accesses are handled in
hardware is also Unpredictable (pgprot_noncached() mappings in user
space).
The /dev/mem driver requires uncached mappings with O_SYNC. The patch
implements the phys_mem_access_prot() function which generates Strongly
Ordered memory attributes if !pfn_valid() (independent of O_SYNC) and
Normal Noncacheable (writecombine) if O_SYNC.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm/include/asm/pgtable.h | 3 +++
arch/arm/mm/mmu.c | 14 ++++++++++++++
2 files changed, 17 insertions(+), 0 deletions(-)
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 42e694f..ea19775 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -337,6 +337,9 @@ static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
#ifdef CONFIG_ARM_DMA_MEM_BUFFERABLE
#define pgprot_dmacoherent(prot) \
__pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_BUFFERABLE)
+#define __HAVE_PHYS_MEM_ACCESS_PROT
+extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
+ unsigned long size, pgprot_t vma_prot);
#else
#define pgprot_dmacoherent(prot) \
__pgprot_modify(prot, L_PTE_MT_MASK|L_PTE_EXEC, L_PTE_MT_UNCACHED)
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 499e22d..e107f1b 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -15,6 +15,7 @@
#include <linux/mman.h>
#include <linux/nodemask.h>
#include <linux/sort.h>
+#include <linux/fs.h>
#include <asm/cputype.h>
#include <asm/mach-types.h>
@@ -487,6 +488,19 @@ static void __init build_mem_type_table(void)
}
}
+#ifdef CONFIG_ARM_DMA_MEM_BUFFERABLE
+pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
+ unsigned long size, pgprot_t vma_prot)
+{
+ if (!pfn_valid(pfn))
+ return pgprot_noncached(vma_prot);
+ else if (file->f_flags & O_SYNC)
+ return pgprot_writecombine(vma_prot);
+ return vma_prot;
+}
+EXPORT_SYMBOL(phys_mem_access_prot);
+#endif
+
#define vectors_base() (vectors_high() ? 0xffff0000 : 0)
static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 0/9] CM's patches for the next merging window(s)
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
` (8 preceding siblings ...)
2010-07-19 13:44 ` [PATCH 9/9] ARM: Implement phys_mem_access_prot() to avoid attributes aliasing Catalin Marinas
@ 2010-07-22 10:03 ` Russell King - ARM Linux
2010-07-22 11:47 ` Catalin Marinas
9 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux @ 2010-07-22 10:03 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jul 19, 2010 at 02:43:57PM +0100, Catalin Marinas wrote:
> This is a series of patches I maintain in my tree at (the 'rebased'
> branch for individual patches or 'master' for a merge-friendly branch):
I think patches 2 through 9 are fine.
I think you and Tony need to discuss patch 1 - you point out that there's
an issue writing the TLS value after this change, and Tony's patches
change the way that the TLS code works. I'd like to hear that there's
no issue if both patches were to be merged.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 0/9] CM's patches for the next merging window(s)
2010-07-22 10:03 ` [PATCH 0/9] CM's patches for the next merging window(s) Russell King - ARM Linux
@ 2010-07-22 11:47 ` Catalin Marinas
0 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2010-07-22 11:47 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 2010-07-22 at 11:03 +0100, Russell King - ARM Linux wrote:
> On Mon, Jul 19, 2010 at 02:43:57PM +0100, Catalin Marinas wrote:
> > This is a series of patches I maintain in my tree at (the 'rebased'
> > branch for individual patches or 'master' for a merge-friendly branch):
>
> I think patches 2 through 9 are fine.
Thanks. I'll push them to -next. Since some of these patches would be
useful to merge in 2.6.36-rc1, please let me know which of them you'd
like in the patch system.
> I think you and Tony need to discuss patch 1 - you point out that there's
> an issue writing the TLS value after this change, and Tony's patches
> change the way that the TLS code works. I'd like to hear that there's
> no issue if both patches were to be merged.
Is Tony's patch going into mainline for 2.6.36-rc1? I can rebase my
domains patch on top of this. Apart from some merge conflicts I don't
see an issue with the implementation. I'll repost once rebased.
--
Catalin
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2010-07-22 11:47 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-19 13:43 [PATCH 0/9] CM's patches for the next merging window(s) Catalin Marinas
2010-07-19 13:44 ` [PATCH 1/9] ARM: Remove the domain switching on ARMv6k/v7 CPUs Catalin Marinas
2010-07-19 13:44 ` [PATCH 2/9] ARM: Assume new page cache pages have dirty D-cache Catalin Marinas
2010-07-19 13:44 ` [PATCH 3/9] ARM: Introduce __sync_icache_dcache() for VIPT caches Catalin Marinas
2010-07-19 13:44 ` [PATCH 4/9] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
2010-07-19 13:44 ` [PATCH 5/9] ARM: Introduce *_relaxed() I/O accessors Catalin Marinas
2010-07-19 13:44 ` [PATCH 6/9] ARM: Convert L2x0 to use the IO relaxed operations Catalin Marinas
2010-07-19 13:44 ` [PATCH 7/9] ARM: Add barriers to the I/O accessors if ARM_DMA_MEM_BUFFERABLE Catalin Marinas
2010-07-19 13:44 ` [PATCH 8/9] ARM: Improve the L2 cache performance when PL310 is used Catalin Marinas
2010-07-19 13:44 ` [PATCH 9/9] ARM: Implement phys_mem_access_prot() to avoid attributes aliasing Catalin Marinas
2010-07-22 10:03 ` [PATCH 0/9] CM's patches for the next merging window(s) Russell King - ARM Linux
2010-07-22 11:47 ` Catalin Marinas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).