* [PATCH 1/5] ARM: Allow lazy cache flushing via PG_arch_1 for highmem pages
2010-07-20 17:12 [PATCH 0/5] ARM: Cache maintenance changes Catalin Marinas
@ 2010-07-20 17:12 ` Catalin Marinas
2010-07-20 17:12 ` [PATCH 2/5] ARM: Assume new page cache pages have dirty D-cache Catalin Marinas
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2010-07-20 17:12 UTC
To: linux-arm-kernel
Commit d73cd42 forced non-lazy cache flushing of highmem pages in
flush_dcache_page(). This is no longer needed, since __flush_dcache_page()
(called lazily from update_mmu_cache()) has been able to handle highmem
pages since commit 7e5a69e.
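For reference, a condensed sketch of the highmem-capable
__flush_dcache_page() after commit 7e5a69e (the aliasing-cache path is
omitted, and the function is renamed to make clear this is an
illustration, not the exact kernel code; it assumes the usual
arch/arm/mm declarations):

/*
 * Condensed sketch: write back the kernel mapping of a page, including
 * highmem pages, by flushing through an existing kmap when one exists.
 */
static void __flush_dcache_page_sketch(struct address_space *mapping,
				       struct page *page)
{
	if (!PageHighMem(page)) {
		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
	} else {
		void *addr = kmap_high_get(page);	/* existing kmap, if any */
		if (addr) {
			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
			kunmap_high(page);
		}
	}
	/* mapping is only used by the aliasing-cache path, omitted here */
	(void)mapping;
}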
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
---
arch/arm/mm/flush.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index c6844cb..87dd5ff 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -247,7 +247,7 @@ void flush_dcache_page(struct page *page)
mapping = page_mapping(page);
#ifndef CONFIG_SMP
- if (!PageHighMem(page) && mapping && !mapping_mapped(mapping))
+ if (mapping && !mapping_mapped(mapping))
set_bit(PG_dcache_dirty, &page->flags);
else
#endif
* [PATCH 2/5] ARM: Assume new page cache pages have dirty D-cache
2010-07-20 17:12 [PATCH 0/5] ARM: Cache maintenance changes Catalin Marinas
2010-07-20 17:12 ` [PATCH 1/5] ARM: Allow lazy cache flushing via PG_arch_1 for highmem pages Catalin Marinas
@ 2010-07-20 17:12 ` Catalin Marinas
2010-07-20 17:12 ` [PATCH 3/5] ARM: Introduce __sync_icache_dcache() for VIPT caches Catalin Marinas
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2010-07-20 17:12 UTC
To: linux-arm-kernel
There are places in Linux where writes to newly allocated page cache
pages happen without a subsequent call to flush_dcache_page() (several
PIO drivers, including the USB HCD). This patch changes the meaning of
PG_arch_1 to PG_dcache_clean and always flushes the D-cache for a newly
mapped page in update_mmu_cache().
The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().
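The effect of the inverted bit, as a sketch (illustrative, not kernel
code): a newly allocated page starts with PG_arch_1 clear, which now
means "potentially dirty", so the first user-space mapping always
flushes, even when the data was written by PIO without a
flush_dcache_page() call:

/*
 * Illustrative: test_and_set_bit() returns the old bit value, so the
 * flush runs exactly once, on the first mapping of a page that nobody
 * has marked clean. Under the old PG_dcache_dirty scheme, a page
 * written by PIO but never passed to flush_dcache_page() had the bit
 * clear and was wrongly skipped here.
 */
static void map_to_user_sketch(struct page *page)
{
	if (!test_and_set_bit(PG_dcache_clean, &page->flags))
		__flush_dcache_page(page_mapping(page), page);
}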
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Nicolas Pitre <nico@linaro.org>
---
arch/arm/include/asm/cacheflush.h | 6 +++---
arch/arm/include/asm/tlbflush.h | 2 +-
arch/arm/mm/copypage-v4mc.c | 2 +-
arch/arm/mm/copypage-v6.c | 2 +-
arch/arm/mm/copypage-xscale.c | 2 +-
arch/arm/mm/dma-mapping.c | 6 ++++++
arch/arm/mm/fault-armv.c | 4 ++--
arch/arm/mm/flush.c | 3 ++-
8 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 4656a24..d3730f0 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -137,10 +137,10 @@
#endif
/*
- * This flag is used to indicate that the page pointed to by a pte
- * is dirty and requires cleaning before returning it to the user.
+ * This flag is used to indicate that the page pointed to by a pte is clean
+ * and does not require cleaning before returning it to the user.
*/
-#define PG_dcache_dirty PG_arch_1
+#define PG_dcache_clean PG_arch_1
/*
* MM Cache Management
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index bd863d8..40a7092 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -552,7 +552,7 @@ extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
#endif
/*
- * if PG_dcache_dirty is set for the page, we need to ensure that any
+ * If PG_dcache_clean is not set for the page, we need to ensure that any
* cache entries for the kernels virtual memory range are written
* back to the page.
*/
diff --git a/arch/arm/mm/copypage-v4mc.c b/arch/arm/mm/copypage-v4mc.c
index 598c51a..b806151 100644
--- a/arch/arm/mm/copypage-v4mc.c
+++ b/arch/arm/mm/copypage-v4mc.c
@@ -73,7 +73,7 @@ void v4_mc_copy_user_highpage(struct page *to, struct page *from,
{
void *kto = kmap_atomic(to, KM_USER1);
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
spin_lock(&minicache_lock);
diff --git a/arch/arm/mm/copypage-v6.c b/arch/arm/mm/copypage-v6.c
index f55fa10..bdba6c6 100644
--- a/arch/arm/mm/copypage-v6.c
+++ b/arch/arm/mm/copypage-v6.c
@@ -79,7 +79,7 @@ static void v6_copy_user_highpage_aliasing(struct page *to,
unsigned int offset = CACHE_COLOUR(vaddr);
unsigned long kfrom, kto;
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
/* FIXME: not highmem safe */
diff --git a/arch/arm/mm/copypage-xscale.c b/arch/arm/mm/copypage-xscale.c
index 9920c0a..649bbcd 100644
--- a/arch/arm/mm/copypage-xscale.c
+++ b/arch/arm/mm/copypage-xscale.c
@@ -95,7 +95,7 @@ void xscale_mc_copy_user_highpage(struct page *to, struct page *from,
{
void *kto = kmap_atomic(to, KM_USER1);
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
spin_lock(&minicache_lock);
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 9e7742f..fa3d07d 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -508,6 +508,12 @@ void ___dma_page_dev_to_cpu(struct page *page, unsigned long off,
outer_inv_range(paddr, paddr + size);
dma_cache_maint_page(page, off, size, dir, dmac_unmap_area);
+
+ /*
+ * Mark the D-cache clean for this page to avoid extra flushing.
+ */
+ if (dir != DMA_TO_DEVICE && off == 0 && size >= PAGE_SIZE)
+ set_bit(PG_dcache_clean, &page->flags);
}
EXPORT_SYMBOL(___dma_page_dev_to_cpu);
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 9b906de..58846cb 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -141,7 +141,7 @@ make_coherent(struct address_space *mapping, struct vm_area_struct *vma,
* a page table, or changing an existing PTE. Basically, there are two
* things that we need to take care of:
*
- * 1. If PG_dcache_dirty is set for the page, we need to ensure
+ * 1. If PG_dcache_clean is not set for the page, we need to ensure
* that any cache entries for the kernels virtual memory
* range are written back to the page.
* 2. If we have multiple shared mappings of the same space in
@@ -169,7 +169,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
mapping = page_mapping(page);
#ifndef CONFIG_SMP
- if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &page->flags))
__flush_dcache_page(mapping, page);
#endif
if (mapping) {
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 87dd5ff..b4efce9 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -248,7 +248,7 @@ void flush_dcache_page(struct page *page)
#ifndef CONFIG_SMP
if (mapping && !mapping_mapped(mapping))
- set_bit(PG_dcache_dirty, &page->flags);
+ clear_bit(PG_dcache_clean, &page->flags);
else
#endif
{
@@ -257,6 +257,7 @@ void flush_dcache_page(struct page *page)
__flush_dcache_aliases(mapping, page);
else if (mapping)
__flush_icache_all();
+ set_bit(PG_dcache_clean, &page->flags);
}
}
EXPORT_SYMBOL(flush_dcache_page);
* [PATCH 3/5] ARM: Introduce __sync_icache_dcache() for VIPT caches
2010-07-20 17:12 [PATCH 0/5] ARM: Cache maintenance changes Catalin Marinas
2010-07-20 17:12 ` [PATCH 1/5] ARM: Allow lazy cache flushing via PG_arch_1 for highmem pages Catalin Marinas
2010-07-20 17:12 ` [PATCH 2/5] ARM: Assume new page cache pages have dirty D-cache Catalin Marinas
@ 2010-07-20 17:12 ` Catalin Marinas
2010-07-20 17:12 ` [PATCH 4/5] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
2010-07-20 17:12 ` [PATCH 5/5] ARM: Remove superfluous flush_kernel_dcache_page() Catalin Marinas
4 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2010-07-20 17:12 UTC
To: linux-arm-kernel
On SMP systems there is a small window in which a PTE can become
visible to another CPU before the cache maintenance in
update_mmu_cache() has run. To avoid this, the cache maintenance must
be done in set_pte_at() (as on IA-64 and PowerPC).
This patch provides a unified VIPT cache handling mechanism and
implements the __sync_icache_dcache() function for ARMv6 and later
architectures. It is called from set_pte_at() and replaces
update_mmu_cache(); the latter is still used on VIVT hardware, where a
vm_area_struct is required.
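The window being closed, sketched as a timeline (distilled from the
description above, not kernel code):

/*
 * Old ordering:
 *
 *   CPU0                                  CPU1
 *   set_pte_ext(ptep, pte, PTE_EXT_NG)
 *                                         no fault taken; executes stale
 *                                         I-cache contents at addr
 *   update_mmu_cache() flushes            too late
 *
 * New ordering, condensed from the set_pte_at() hunk below (user
 * addresses only):
 */
static inline void set_pte_at_sketch(pte_t *ptep, pte_t pteval)
{
	__sync_icache_dcache(pteval);		/* make caches coherent first... */
	set_pte_ext(ptep, pteval, PTE_EXT_NG);	/* ...then publish the PTE */
}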
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Nicolas Pitre <nico@linaro.org>
---
arch/arm/include/asm/pgtable.h | 26 +++++++++++++++++++++++---
arch/arm/include/asm/tlbflush.h | 10 +++++++++-
arch/arm/mm/fault-armv.c | 4 ++--
arch/arm/mm/flush.c | 30 ++++++++++++++++++++++++++++++
4 files changed, 64 insertions(+), 6 deletions(-)
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index ab68cf1..42e694f 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -278,9 +278,24 @@ extern struct page *empty_zero_page;
#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
-#define set_pte_at(mm,addr,ptep,pteval) do { \
- set_pte_ext(ptep, pteval, (addr) >= TASK_SIZE ? 0 : PTE_EXT_NG); \
- } while (0)
+#if __LINUX_ARM_ARCH__ < 6
+static inline void __sync_icache_dcache(pte_t pteval)
+{
+}
+#else
+extern void __sync_icache_dcache(pte_t pteval);
+#endif
+
+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pteval)
+{
+ if (addr >= TASK_SIZE)
+ set_pte_ext(ptep, pteval, 0);
+ else {
+ __sync_icache_dcache(pteval);
+ set_pte_ext(ptep, pteval, PTE_EXT_NG);
+ }
+}
/*
* The following only work if pte_present() is true.
@@ -290,8 +305,13 @@ extern struct page *empty_zero_page;
#define pte_write(pte) (pte_val(pte) & L_PTE_WRITE)
#define pte_dirty(pte) (pte_val(pte) & L_PTE_DIRTY)
#define pte_young(pte) (pte_val(pte) & L_PTE_YOUNG)
+#define pte_exec(pte) (pte_val(pte) & L_PTE_EXEC)
#define pte_special(pte) (0)
+#define pte_present_user(pte) \
+ ((pte_val(pte) & (L_PTE_PRESENT | L_PTE_USER)) == \
+ (L_PTE_PRESENT | L_PTE_USER))
+
#define PTE_BIT_FUNC(fn,op) \
static inline pte_t pte_##fn(pte_t pte) { pte_val(pte) op; return pte; }
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index 40a7092..8ec4775 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -554,10 +554,18 @@ extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
/*
* If PG_dcache_clean is not set for the page, we need to ensure that any
* cache entries for the kernels virtual memory range are written
- * back to the page.
+ * back to the page. On ARMv6 and later, the cache coherency is handled via
+ * the set_pte_at() function.
*/
+#if __LINUX_ARM_ARCH__ < 6
extern void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
pte_t *ptep);
+#else
+static inline void update_mmu_cache(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+}
+#endif
#endif
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 58846cb..8440d95 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -28,6 +28,7 @@
static unsigned long shared_pte_mask = L_PTE_MT_BUFFERABLE;
+#if __LINUX_ARM_ARCH__ < 6
/*
* We take the easy way out of this problem - we make the
* PTE uncacheable. However, we leave the write buffer on.
@@ -168,10 +169,8 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
return;
mapping = page_mapping(page);
-#ifndef CONFIG_SMP
if (!test_and_set_bit(PG_dcache_clean, &page->flags))
__flush_dcache_page(mapping, page);
-#endif
if (mapping) {
if (cache_is_vivt())
make_coherent(mapping, vma, addr, ptep, pfn);
@@ -179,6 +178,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
__flush_icache_all();
}
}
+#endif /* __LINUX_ARM_ARCH__ < 6 */
/*
* Check whether the write buffer has physical address aliasing
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index b4efce9..dd5b012 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -215,6 +215,36 @@ static void __flush_dcache_aliases(struct address_space *mapping, struct page *p
flush_dcache_mmap_unlock(mapping);
}
+#if __LINUX_ARM_ARCH__ >= 6
+void __sync_icache_dcache(pte_t pteval)
+{
+ unsigned long pfn;
+ struct page *page;
+ struct address_space *mapping;
+
+ if (!pte_present_user(pteval))
+ return;
+ if (cache_is_vipt_nonaliasing() && !pte_exec(pteval))
+ /* only flush non-aliasing VIPT caches for exec mappings */
+ return;
+ pfn = pte_pfn(pteval);
+ if (!pfn_valid(pfn))
+ return;
+
+ page = pfn_to_page(pfn);
+ if (cache_is_vipt_aliasing())
+ mapping = page_mapping(page);
+ else
+ mapping = NULL;
+
+ if (!test_and_set_bit(PG_dcache_clean, &page->flags))
+ __flush_dcache_page(mapping, page);
+ /* pte_exec() already checked above for non-aliasing VIPT cache */
+ if (cache_is_vipt_nonaliasing() || pte_exec(pteval))
+ __flush_icache_all();
+}
+#endif
+
/*
* Ensure cache coherency between kernel mapping and userspace mapping
* of this page.
* [PATCH 4/5] ARM: Use lazy cache flushing on ARMv7 SMP systems
2010-07-20 17:12 [PATCH 0/5] ARM: Cache maintenance changes Catalin Marinas
` (2 preceding siblings ...)
2010-07-20 17:12 ` [PATCH 3/5] ARM: Introduce __sync_icache_dcache() for VIPT caches Catalin Marinas
@ 2010-07-20 17:12 ` Catalin Marinas
2010-07-20 17:12 ` [PATCH 5/5] ARM: Remove superfluous flush_kernel_dcache_page() Catalin Marinas
4 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2010-07-20 17:12 UTC
To: linux-arm-kernel
ARMv7 processors such as the Cortex-A9 broadcast cache maintenance
operations in hardware. This patch allows the
flush_dcache_page()/update_mmu_cache() pair to work in lazy flushing
mode, as in the UP case.
Note that cache flushing on SMP systems now takes place via
set_pte_at() (__sync_icache_dcache()), so there is no race with other
CPUs executing code from the new PTE before the cache flushing has
completed.
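The smp_plat.h hunk below turns cache_ops_need_broadcast() into a
compile-time 0 on UP and ARMv7 builds, so the broadcast paths compile
away without #ifdef CONFIG_SMP guards. The pattern, sketched
(illustrative, condensed from the flush_ptrace_access() hunk):

/*
 * Illustrative: a constant-0 predicate lets the optimiser delete the
 * broadcast branch, so the same source serves UP, ARMv6 SMP and ARMv7
 * SMP builds without preprocessor guards.
 */
static void flush_exec_range_sketch(unsigned long addr, size_t len)
{
	__cpuc_coherent_kern_range(addr, addr + len);
	if (cache_ops_need_broadcast())	/* folds to 0 on UP and ARMv7 */
		smp_call_function(flush_ptrace_access_other, NULL, 1);
}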
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Nicolas Pitre <nico@linaro.org>
---
arch/arm/include/asm/smp_plat.h | 4 ++++
arch/arm/mm/flush.c | 13 ++++---------
2 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/arch/arm/include/asm/smp_plat.h b/arch/arm/include/asm/smp_plat.h
index e621530..963a338 100644
--- a/arch/arm/include/asm/smp_plat.h
+++ b/arch/arm/include/asm/smp_plat.h
@@ -13,9 +13,13 @@ static inline int tlb_ops_need_broadcast(void)
return ((read_cpuid_ext(CPUID_EXT_MMFR3) >> 12) & 0xf) < 2;
}
+#if !defined(CONFIG_SMP) || __LINUX_ARM_ARCH__ >= 7
+#define cache_ops_need_broadcast() 0
+#else
static inline int cache_ops_need_broadcast(void)
{
return ((read_cpuid_ext(CPUID_EXT_MMFR3) >> 12) & 0xf) < 1;
}
+#endif
#endif
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index dd5b012..2332b77 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -17,6 +17,7 @@
#include <asm/smp_plat.h>
#include <asm/system.h>
#include <asm/tlbflush.h>
+#include <asm/smp_plat.h>
#include "mm.h"
@@ -93,12 +94,10 @@ void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr, unsig
#define flush_pfn_alias(pfn,vaddr) do { } while (0)
#endif
-#ifdef CONFIG_SMP
static void flush_ptrace_access_other(void *args)
{
__flush_icache_all();
}
-#endif
static
void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
@@ -122,11 +121,9 @@ void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
if (vma->vm_flags & VM_EXEC) {
unsigned long addr = (unsigned long)kaddr;
__cpuc_coherent_kern_range(addr, addr + len);
-#ifdef CONFIG_SMP
if (cache_ops_need_broadcast())
smp_call_function(flush_ptrace_access_other,
NULL, 1);
-#endif
}
}
@@ -276,12 +273,10 @@ void flush_dcache_page(struct page *page)
mapping = page_mapping(page);
-#ifndef CONFIG_SMP
- if (mapping && !mapping_mapped(mapping))
+ if (!cache_ops_need_broadcast() &&
+ mapping && !mapping_mapped(mapping))
clear_bit(PG_dcache_clean, &page->flags);
- else
-#endif
- {
+ else {
__flush_dcache_page(mapping, page);
if (mapping && cache_is_vivt())
__flush_dcache_aliases(mapping, page);
* [PATCH 5/5] ARM: Remove superfluous flush_kernel_dcache_page()
2010-07-20 17:12 [PATCH 0/5] ARM: Cache maintenance changes Catalin Marinas
` (3 preceding siblings ...)
2010-07-20 17:12 ` [PATCH 4/5] ARM: Use lazy cache flushing on ARMv7 SMP systems Catalin Marinas
@ 2010-07-20 17:12 ` Catalin Marinas
4 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2010-07-20 17:12 UTC
To: linux-arm-kernel
Since page cache pages are now considered 'dirty' by default, cache
flushing is handled via __flush_dcache_page() when a page is first
mapped to user space. Highmem pages on VIVT systems are flushed during
kunmap(), and flush_kernel_dcache_page() was already a no-op in that
case.
ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE is still defined since ARM needs
specific implementations for flush_kernel_vmap_range() and
invalidate_kernel_vmap_range().
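A typical PIO pattern this relies on (illustrative, with a hypothetical
helper name, assuming the usual kernel headers): the driver dirties the
kernel mapping of a page cache page and calls
flush_kernel_dcache_page(); under lazy flushing the page is assumed
dirty anyway, and the deferred __flush_dcache_page() at map time covers
the dirty lines:

/*
 * Illustrative PIO write path: the deferred flush at first user-space
 * mapping makes the explicit flush below unnecessary on ARM.
 */
static void pio_copy_to_page_sketch(struct page *page, const void *buf,
				    size_t len)
{
	void *kaddr = kmap(page);
	memcpy(kaddr, buf, len);	/* dirties D-cache via kernel mapping */
	flush_kernel_dcache_page(page);	/* now a no-op on ARM */
	kunmap(page);			/* VIVT highmem: flushed here anyway */
}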
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Nicolas Pitre <nico@linaro.org>
---
arch/arm/include/asm/cacheflush.h | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index d3730f0..042e139 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -405,9 +405,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
static inline void flush_kernel_dcache_page(struct page *page)
{
- /* highmem pages are always flushed upon kunmap already */
- if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
- __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
}
#define flush_dcache_mmap_lock(mapping) \