* [PATCH 0/4] MIPS cache & highmem fixes
@ 2016-03-01 2:37 Paul Burton
2016-03-01 2:37 ` [PATCH 1/4] MIPS: Flush dcache for flush_kernel_dcache_page Paul Burton
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Paul Burton @ 2016-03-01 2:37 UTC (permalink / raw)
To: linux-mips
Cc: Paul Burton, Lars Persson, Steven J. Hill, Huacai Chen,
David Daney, Aneesh Kumar K.V, linux-kernel, Jerome Marchand,
Ralf Baechle, Kirill A. Shutemov, Andrew Morton
This series fixes up a few issues with our current cache maintenance
code, some specific to highmem & some not. It fixes an issue with icache
corruption seen on the pistachio SoC & Ci40, and icache corruption seen
using highmem on MIPS Malta boards with a P5600 CPU.
Applies atop v4.5-rc6. It would be great to squeeze these in for v4.5,
but I recognise it's quite late in the cycle & this brokenness has been
around for a while, so I won't object to v4.6.
Thanks,
Paul
Paul Burton (4):
MIPS: Flush dcache for flush_kernel_dcache_page
MIPS: Flush highmem pages in __flush_dcache_page
MIPS: Handle highmem pages in __update_cache
MIPS: Sync icache & dcache in set_pte_at
arch/mips/include/asm/cacheflush.h | 7 +------
arch/mips/include/asm/pgtable.h | 26 +++++++++++++++++++-----
arch/mips/mm/cache.c | 41 +++++++++++++++++++-------------------
3 files changed, 43 insertions(+), 31 deletions(-)
--
2.7.1
* [PATCH 1/4] MIPS: Flush dcache for flush_kernel_dcache_page
2016-03-01 2:37 [PATCH 0/4] MIPS cache & highmem fixes Paul Burton
@ 2016-03-01 2:37 ` Paul Burton
2016-03-04 15:09 ` Lars Persson
2016-03-01 2:37 ` [PATCH 2/4] MIPS: Flush highmem pages in __flush_dcache_page Paul Burton
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Paul Burton @ 2016-03-01 2:37 UTC (permalink / raw)
To: linux-mips; +Cc: Paul Burton, Ralf Baechle, Lars Persson, linux-kernel
The flush_kernel_dcache_page function was previously essentially a nop.
This is incorrect for MIPS, where if a page has been modified & either
it aliases or it's executable & the icache doesn't fill from dcache then
the content needs to be written back from dcache to the next level of
the cache hierarchy (which is shared with the icache).
Implement this by simply calling flush_dcache_page, treating this
kmapped cache flush function (flush_kernel_dcache_page) exactly the same
as its non-kmapped counterpart (flush_dcache_page).
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---
arch/mips/include/asm/cacheflush.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
index 723229f..7e9f468 100644
--- a/arch/mips/include/asm/cacheflush.h
+++ b/arch/mips/include/asm/cacheflush.h
@@ -132,6 +132,7 @@ static inline void kunmap_noncoherent(void)
static inline void flush_kernel_dcache_page(struct page *page)
{
BUG_ON(cpu_has_dc_aliases && PageHighMem(page));
+ flush_dcache_page(page);
}
/*
--
2.7.1
* [PATCH 2/4] MIPS: Flush highmem pages in __flush_dcache_page
2016-03-01 2:37 [PATCH 0/4] MIPS cache & highmem fixes Paul Burton
2016-03-01 2:37 ` [PATCH 1/4] MIPS: Flush dcache for flush_kernel_dcache_page Paul Burton
@ 2016-03-01 2:37 ` Paul Burton
2016-03-29 8:35 ` Ralf Baechle
2016-03-01 2:37 ` [PATCH 3/4] MIPS: Handle highmem pages in __update_cache Paul Burton
[not found] ` <1456799879-14711-5-git-send-email-paul.burton@imgtec.com>
3 siblings, 1 reply; 14+ messages in thread
From: Paul Burton @ 2016-03-01 2:37 UTC (permalink / raw)
To: linux-mips
Cc: Paul Burton, Lars Persson, linux-kernel, Andrew Morton,
Kirill A. Shutemov, Ralf Baechle
When flush_dcache_page is called on an executable page, that page is
about to be provided to userland & we can presume that the icache
contains no valid entries for its address range. However if the icache
does not fill from the dcache then we cannot presume that the page's
content has been written back as far as the memories that the dcache
will fill from (i.e. L2 or further out).
This was being done for lowmem pages, but not for highmem which can lead
to icache corruption. Fix this by mapping highmem pages & flushing their
content from the dcache in __flush_dcache_page before providing the page
to userland, just as is done for lowmem pages.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Lars Persson <lars.persson@axis.com>
---
arch/mips/mm/cache.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
index 3f159ca..5a67d8c 100644
--- a/arch/mips/mm/cache.c
+++ b/arch/mips/mm/cache.c
@@ -16,6 +16,7 @@
#include <linux/mm.h>
#include <asm/cacheflush.h>
+#include <asm/highmem.h>
#include <asm/processor.h>
#include <asm/cpu.h>
#include <asm/cpu-features.h>
@@ -83,8 +84,6 @@ void __flush_dcache_page(struct page *page)
struct address_space *mapping = page_mapping(page);
unsigned long addr;
- if (PageHighMem(page))
- return;
if (mapping && !mapping_mapped(mapping)) {
SetPageDcacheDirty(page);
return;
@@ -95,8 +94,15 @@ void __flush_dcache_page(struct page *page)
* case is for exec env/arg pages and those are %99 certainly going to
* get faulted into the tlb (and thus flushed) anyways.
*/
- addr = (unsigned long) page_address(page);
+ if (PageHighMem(page))
+ addr = (unsigned long)kmap_atomic(page);
+ else
+ addr = (unsigned long)page_address(page);
+
flush_data_cache_page(addr);
+
+ if (PageHighMem(page))
+ __kunmap_atomic((void *)addr);
}
EXPORT_SYMBOL(__flush_dcache_page);
--
2.7.1
* [PATCH 3/4] MIPS: Handle highmem pages in __update_cache
2016-03-01 2:37 [PATCH 0/4] MIPS cache & highmem fixes Paul Burton
2016-03-01 2:37 ` [PATCH 1/4] MIPS: Flush dcache for flush_kernel_dcache_page Paul Burton
2016-03-01 2:37 ` [PATCH 2/4] MIPS: Flush highmem pages in __flush_dcache_page Paul Burton
@ 2016-03-01 2:37 ` Paul Burton
2016-03-29 8:39 ` Ralf Baechle
[not found] ` <1456799879-14711-5-git-send-email-paul.burton@imgtec.com>
3 siblings, 1 reply; 14+ messages in thread
From: Paul Burton @ 2016-03-01 2:37 UTC (permalink / raw)
To: linux-mips
Cc: Paul Burton, Lars Persson, stable # v4.1+, linux-kernel,
Andrew Morton, Jerome Marchand, Kirill A. Shutemov, Ralf Baechle
The following patch will expose __update_cache to highmem pages. Handle
them by mapping them in for the duration of the cache maintenance, just
like in __flush_dcache_page. The code for that isn't shared because we
need the page address in __update_cache, so sharing became messy. Given
that it all amounts to an extra 5 lines, just duplicate it.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Lars Persson <lars.persson@axis.com>
Cc: stable <stable@vger.kernel.org> # v4.1+
---
arch/mips/mm/cache.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
index 5a67d8c..8befa55 100644
--- a/arch/mips/mm/cache.c
+++ b/arch/mips/mm/cache.c
@@ -149,9 +149,17 @@ void __update_cache(struct vm_area_struct *vma, unsigned long address,
return;
page = pfn_to_page(pfn);
if (page_mapping(page) && Page_dcache_dirty(page)) {
- addr = (unsigned long) page_address(page);
+ if (PageHighMem(page))
+ addr = (unsigned long)kmap_atomic(page);
+ else
+ addr = (unsigned long)page_address(page);
+
if (exec || pages_do_alias(addr, address & PAGE_MASK))
flush_data_cache_page(addr);
+
+ if (PageHighMem(page))
+ __kunmap_atomic((void *)addr);
+
ClearPageDcacheDirty(page);
}
}
--
2.7.1
* Re: [PATCH 4/4] MIPS: Sync icache & dcache in set_pte_at
[not found] ` <20160301171940.GA26791@NP-P-BURTON>
@ 2016-03-02 14:12 ` Ralf Baechle
0 siblings, 0 replies; 14+ messages in thread
From: Ralf Baechle @ 2016-03-02 14:12 UTC (permalink / raw)
To: Paul Burton
Cc: David Daney, linux-mips, Lars Persson, stable # v4.1+,
Steven J. Hill, David Daney, Huacai Chen, Aneesh Kumar K.V,
linux-kernel, Andrew Morton, Jerome Marchand, Kirill A. Shutemov
On Tue, Mar 01, 2016 at 05:19:40PM +0000, Paul Burton wrote:
> > >+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
> > >+ pte_t *ptep, pte_t pteval)
> > >+{
> > >+ extern void __update_cache(unsigned long address, pte_t pte);
> > >+
> > >+ if (!pte_present(pteval))
> > >+ goto cache_sync_done;
> > >+
> > >+ if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
> > >+ goto cache_sync_done;
> > >+
> > >+ __update_cache(addr, pteval);
> > >+cache_sync_done:
> > >+ set_pte(ptep, pteval);
> > >+}
> > >+
> >
> > This seems crazy.
>
> Perhaps, but also correct...
>
> > I don't think any other architecture does this type of work in set_pte_at().
>
> Yes they do. As mentioned in the commit message see arm, ia64 or powerpc
> for architectures that all do the same sort of thing in set_pte_at.
>
> > Can you look into finding a better way?
>
> Not that I can see.
FYI, this is the original commit message adding set_pte_at(). The commit
predates Linus' git history but is in the various history trees, for example
https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=ae3d0a847f4b38812241e4a5dc3371965c752a8c
Or for your convenience:
commit ae3d0a847f4b38812241e4a5dc3371965c752a8c
Author: David S. Miller <davem@nuts.davemloft.net>
Date: Tue Feb 22 23:42:56 2005 -0800
[MM]: Add set_pte_at() which takes 'mm' and 'addr' args.
I'm taking a slightly different approach this time around so things
are easier to integrate. Here is the first patch which builds the
infrastructure. Basically:
1) Add set_pte_at() which is set_pte() with 'mm' and 'addr' arguments
added. All generic code uses set_pte_at().
Most platforms simply get this define:
#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)
I chose this method over simply changing all set_pte() call sites
because many platforms implement this in assembler and it would
take forever to preserve the build and stabilize things if modifying
that was necessary.
Soon, with platform maintainer's help, we can kill of set_pte() entirely.
To be honest, there are only a handful of set_pte() call sites in the
arch specific code.
Actually, in this patch ppc64 is completely set_pte() free and does not
define it.
2) pte_clear() gets 'mm' and 'addr' arguments now.
This had a cascading effect on many ptep_test_and_*() routines. Specifically:
a) ptep_test_and_clear_{young,dirty}() now take 'vma' and 'address' args.
b) ptep_get_and_clear now take 'mm' and 'address' args.
c) ptep_mkdirty was deleted, unused by any code.
d) ptep_set_wrprotect now takes 'mm' and 'address' args.
I've tested this patch as follows:
1) compile and run tested on sparc64/SMP
2) compile tested on:
a) ppc64/SMP
b) i386 both with and without PAE enabled
Signed-off-by: David S. Miller <davem@davemloft.net>
Which doesn't specify whether the function is meant for cache
management, nor is it documented in Documentation/cachetlb.txt.
Ralf
* Re: [4/4] MIPS: Sync icache & dcache in set_pte_at
[not found] ` <1456799879-14711-5-git-send-email-paul.burton@imgtec.com>
[not found] ` <56D5CDB3.80407@caviumnetworks.com>
@ 2016-03-03 3:03 ` Leonid Yegoshin
2016-03-04 19:02 ` [PATCH 4/4] " Lars Persson
2 siblings, 0 replies; 14+ messages in thread
From: Leonid Yegoshin @ 2016-03-03 3:03 UTC (permalink / raw)
To: Paul Burton, linux-mips
Cc: Lars Persson, stable # v4.1+, Steven J. Hill, David Daney,
Huacai Chen, Aneesh Kumar K.V, linux-kernel, Andrew Morton,
Jerome Marchand, Kirill A. Shutemov, Ralf Baechle
Paul Burton wrote:
> It is, however, used in such a way by others & seems to me like the only
> correct way to implement the lazy cache flushing. The alternative would
> be to adjust all generic code to ensure flush_icache_page gets called
> before set_pte_at
... which is exactly the case right now. The two callers of flush_icache_page() are:
1) do_swap_page()
flush_icache_page(vma, page);
if (pte_swp_soft_dirty(orig_pte))
pte = pte_mksoft_dirty(pte);
set_pte_at(mm, address, page_table, pte);
...
2) do_set_pte()
flush_icache_page(vma, page);
entry = mk_pte(page, vma->vm_page_prot);
if (write)
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
if (anon) {
inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
page_add_new_anon_rmap(page, vma, address);
} else {
inc_mm_counter_fast(vma->vm_mm, MM_FILEPAGES);
page_add_file_rmap(page);
}
set_pte_at(vma->vm_mm, address, pte, entry);
/* no need to invalidate: a not-present page won't be cached */
update_mmu_cache(vma, address, pte);
....
You only need to be sure that flush_icache_page() completes a lazy
D-cache flush.
- Leonid.
* Re: [PATCH 1/4] MIPS: Flush dcache for flush_kernel_dcache_page
2016-03-01 2:37 ` [PATCH 1/4] MIPS: Flush dcache for flush_kernel_dcache_page Paul Burton
@ 2016-03-04 15:09 ` Lars Persson
2016-03-29 8:29 ` Ralf Baechle
0 siblings, 1 reply; 14+ messages in thread
From: Lars Persson @ 2016-03-04 15:09 UTC (permalink / raw)
To: Paul Burton
Cc: linux-mips@linux-mips.org, Ralf Baechle,
linux-kernel@vger.kernel.org
> On 1 March 2016, at 03:38, Paul Burton <paul.burton@imgtec.com> wrote:
>
> The flush_kernel_dcache_page function was previously essentially a nop.
> This is incorrect for MIPS, where if a page has been modified & either
> it aliases or it's executable & the icache doesn't fill from dcache then
> the content needs to be written back from dcache to the next level of
> the cache hierarchy (which is shared with the icache).
>
> Implement this by simply calling flush_dcache_page, treating this
> kmapped cache flush function (flush_kernel_dcache_page) exactly the same
> as its non-kmapped counterpart (flush_dcache_page).
>
> Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> ---
>
> arch/mips/include/asm/cacheflush.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
> index 723229f..7e9f468 100644
> --- a/arch/mips/include/asm/cacheflush.h
> +++ b/arch/mips/include/asm/cacheflush.h
> @@ -132,6 +132,7 @@ static inline void kunmap_noncoherent(void)
> static inline void flush_kernel_dcache_page(struct page *page)
> {
> BUG_ON(cpu_has_dc_aliases && PageHighMem(page));
> + flush_dcache_page(page);
Should we instead use __flush_dcache_page(), which flushes immediately for mapped pages? Steven J. Hill's old patch set for highmem did it like this.
> }
>
> /*
> --
> 2.7.1
>
* Re: [PATCH 4/4] MIPS: Sync icache & dcache in set_pte_at
[not found] ` <1456799879-14711-5-git-send-email-paul.burton@imgtec.com>
[not found] ` <56D5CDB3.80407@caviumnetworks.com>
2016-03-03 3:03 ` [4/4] " Leonid Yegoshin
@ 2016-03-04 19:02 ` Lars Persson
2016-03-05 0:21 ` Paul Burton
2 siblings, 1 reply; 14+ messages in thread
From: Lars Persson @ 2016-03-04 19:02 UTC (permalink / raw)
To: Paul Burton
Cc: linux-mips@linux-mips.org, stable # v4.1+, Steven J. Hill,
David Daney, Huacai Chen, Aneesh Kumar K.V,
linux-kernel@vger.kernel.org, Andrew Morton, Jerome Marchand,
Kirill A. Shutemov, Ralf Baechle
Hi
Some further thoughts on the matter. You have so far not shown a valid example of a race condition. The two examples you give in the commit message involve only a _single_ thread existing in the address space (fork and execve).
BR,
Lars
> On 1 March 2016, at 03:39, Paul Burton <paul.burton@imgtec.com> wrote:
>
> It's possible for pages to become visible prior to update_mmu_cache
> running if a thread within the same address space preempts the current
> thread or runs simultaneously on another CPU. That is, the following
> scenario is possible:
>
> CPU0 CPU1
>
> write to page
> flush_dcache_page
> flush_icache_page
> set_pte_at
> map page
> update_mmu_cache
>
> If CPU1 maps the page in between CPU0's set_pte_at, which marks it valid
> & visible, and update_mmu_cache where the dcache flush occurs, then CPU1's
> icache will fill from stale data (unless it fills from the dcache, in
> which case all is good, but most MIPS CPUs don't have this property).
> Commit 4d46a67a3eb8 ("MIPS: Fix race condition in lazy cache flushing.")
> attempted to fix that by performing the dcache flush in
> flush_icache_page such that it occurs before the set_pte_at call makes
> the page visible. However it has the problem that not all code that
> writes to pages exposed to userland call flush_icache_page. There are
> many callers of set_pte_at under mm/ and only 2 of them do call
> flush_icache_page. Thus the race window between a page becoming visible
> & being coherent between the icache & dcache remains open in some cases.
>
> To illustrate some of the cases, a WARN was added to __update_cache with
> this patch applied that triggered in cases where a page about to be
> flushed from the dcache was not the last page provided to
> flush_icache_page. That is, backtraces were obtained for cases in which
> the race window is left open without this patch. The 2 standout examples
> follow.
>
> When forking a process:
>
> [ 15.271842] [<80417630>] __update_cache+0xcc/0x188
> [ 15.277274] [<80530394>] copy_page_range+0x56c/0x6ac
> [ 15.282861] [<8042936c>] copy_process.part.54+0xd40/0x17ac
> [ 15.289028] [<80429f80>] do_fork+0xe4/0x420
> [ 15.293747] [<80413808>] handle_sys+0x128/0x14c
>
> When exec'ing an ELF binary:
>
> [ 14.445964] [<80417630>] __update_cache+0xcc/0x188
> [ 14.451369] [<80538d88>] move_page_tables+0x414/0x498
> [ 14.457075] [<8055d848>] setup_arg_pages+0x220/0x318
> [ 14.462685] [<805b0f38>] load_elf_binary+0x530/0x12a0
> [ 14.468374] [<8055ec3c>] search_binary_handler+0xbc/0x214
> [ 14.474444] [<8055f6c0>] do_execveat_common+0x43c/0x67c
> [ 14.480324] [<8055f938>] do_execve+0x38/0x44
> [ 14.485137] [<80413808>] handle_sys+0x128/0x14c
>
> These code paths write into a page, call flush_dcache_page then call
> set_pte_at without flush_icache_page in between. The end result is that
> the icache can become corrupted & userland processes may execute
> unexpected or invalid code, typically resulting in a reserved
> instruction exception, a trap or a segfault.
>
> Fix this race condition fully by performing any cache maintenance
> required to keep the icache & dcache in sync in set_pte_at, before the
> page is made valid. This has the added bonus of ensuring the cache
> maintenance always happens in one location, rather than being duplicated
> in flush_icache_page & update_mmu_cache. It also matches the way other
> architectures solve the same problem (see arm, ia64 & powerpc).
>
> Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> Reported-by: Ionela Voinescu <ionela.voinescu@imgtec.com>
> Cc: Lars Persson <lars.persson@axis.com>
> Cc: stable <stable@vger.kernel.org> # v4.1+
> Fixes: 4d46a67a3eb8 ("MIPS: Fix race condition in lazy cache flushing.")
>
> ---
>
> arch/mips/include/asm/cacheflush.h | 6 ------
> arch/mips/include/asm/pgtable.h | 26 +++++++++++++++++++++-----
> arch/mips/mm/cache.c | 19 +++----------------
> 3 files changed, 24 insertions(+), 27 deletions(-)
>
> diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
> index 7e9f468..34ed22e 100644
> --- a/arch/mips/include/asm/cacheflush.h
> +++ b/arch/mips/include/asm/cacheflush.h
> @@ -51,7 +51,6 @@ extern void (*flush_cache_range)(struct vm_area_struct *vma,
> unsigned long start, unsigned long end);
> extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn);
> extern void __flush_dcache_page(struct page *page);
> -extern void __flush_icache_page(struct vm_area_struct *vma, struct page *page);
>
> #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
> static inline void flush_dcache_page(struct page *page)
> @@ -77,11 +76,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
> static inline void flush_icache_page(struct vm_area_struct *vma,
> struct page *page)
> {
> - if (!cpu_has_ic_fills_f_dc && (vma->vm_flags & VM_EXEC) &&
> - Page_dcache_dirty(page)) {
> - __flush_icache_page(vma, page);
> - ClearPageDcacheDirty(page);
> - }
> }
>
> extern void (*flush_icache_range)(unsigned long start, unsigned long end);
> diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
> index 9a4fe01..65bf2c0 100644
> --- a/arch/mips/include/asm/pgtable.h
> +++ b/arch/mips/include/asm/pgtable.h
> @@ -127,10 +127,14 @@ do { \
> } \
> } while(0)
>
> +static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
> + pte_t *ptep, pte_t pteval);
> +
> #if defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32)
>
> #define pte_none(pte) (!(((pte).pte_high) & ~_PAGE_GLOBAL))
> #define pte_present(pte) ((pte).pte_low & _PAGE_PRESENT)
> +#define pte_no_exec(pte) ((pte).pte_low & _PAGE_NO_EXEC)
>
> static inline void set_pte(pte_t *ptep, pte_t pte)
> {
> @@ -148,7 +152,6 @@ static inline void set_pte(pte_t *ptep, pte_t pte)
> buddy->pte_high |= _PAGE_GLOBAL;
> }
> }
> -#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
>
> static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> {
> @@ -166,6 +169,7 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
>
> #define pte_none(pte) (!(pte_val(pte) & ~_PAGE_GLOBAL))
> #define pte_present(pte) (pte_val(pte) & _PAGE_PRESENT)
> +#define pte_no_exec(pte) (pte_val(pte) & _PAGE_NO_EXEC)
>
> /*
> * Certain architectures need to do special things when pte's
> @@ -218,7 +222,6 @@ static inline void set_pte(pte_t *ptep, pte_t pteval)
> }
> #endif
> }
> -#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
>
> static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> {
> @@ -234,6 +237,22 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
> }
> #endif
>
> +static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
> + pte_t *ptep, pte_t pteval)
> +{
> + extern void __update_cache(unsigned long address, pte_t pte);
> +
> + if (!pte_present(pteval))
> + goto cache_sync_done;
> +
> + if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
> + goto cache_sync_done;
> +
> + __update_cache(addr, pteval);
> +cache_sync_done:
> + set_pte(ptep, pteval);
> +}
> +
> /*
> * (pmds are folded into puds so this doesn't get actually called,
> * but the define is needed for a generic inline function.)
> @@ -430,15 +449,12 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>
> extern void __update_tlb(struct vm_area_struct *vma, unsigned long address,
> pte_t pte);
> -extern void __update_cache(struct vm_area_struct *vma, unsigned long address,
> - pte_t pte);
>
> static inline void update_mmu_cache(struct vm_area_struct *vma,
> unsigned long address, pte_t *ptep)
> {
> pte_t pte = *ptep;
> __update_tlb(vma, address, pte);
> - __update_cache(vma, address, pte);
> }
>
> static inline void update_mmu_cache_pmd(struct vm_area_struct *vma,
> diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
> index 8befa55..bf04c6c 100644
> --- a/arch/mips/mm/cache.c
> +++ b/arch/mips/mm/cache.c
> @@ -125,30 +125,17 @@ void __flush_anon_page(struct page *page, unsigned long vmaddr)
>
> EXPORT_SYMBOL(__flush_anon_page);
>
> -void __flush_icache_page(struct vm_area_struct *vma, struct page *page)
> -{
> - unsigned long addr;
> -
> - if (PageHighMem(page))
> - return;
> -
> - addr = (unsigned long) page_address(page);
> - flush_data_cache_page(addr);
> -}
> -EXPORT_SYMBOL_GPL(__flush_icache_page);
> -
> -void __update_cache(struct vm_area_struct *vma, unsigned long address,
> - pte_t pte)
> +void __update_cache(unsigned long address, pte_t pte)
> {
> struct page *page;
> unsigned long pfn, addr;
> - int exec = (vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc;
> + int exec = !pte_no_exec(pte) && !cpu_has_ic_fills_f_dc;
>
> pfn = pte_pfn(pte);
> if (unlikely(!pfn_valid(pfn)))
> return;
> page = pfn_to_page(pfn);
> - if (page_mapping(page) && Page_dcache_dirty(page)) {
> + if (Page_dcache_dirty(page)) {
> if (PageHighMem(page))
> addr = (unsigned long)kmap_atomic(page);
> else
> --
> 2.7.1
>
* Re: [PATCH 4/4] MIPS: Sync icache & dcache in set_pte_at
2016-03-04 19:02 ` [PATCH 4/4] " Lars Persson
@ 2016-03-05 0:21 ` Paul Burton
2016-03-05 0:27 ` Paul Burton
0 siblings, 1 reply; 14+ messages in thread
From: Paul Burton @ 2016-03-05 0:21 UTC (permalink / raw)
To: Lars Persson
Cc: linux-mips@linux-mips.org, stable # v4.1+, Steven J. Hill,
David Daney, Huacai Chen, Aneesh Kumar K.V,
linux-kernel@vger.kernel.org, Andrew Morton, Jerome Marchand,
Kirill A. Shutemov, Ralf Baechle
On Fri, Mar 04, 2016 at 07:02:24PM +0000, Lars Persson wrote:
> Hi
>
> Some further thoughts on the matter. You have so far not shown a
> valid example of a race condition. The two examples you give in the
> commit message are for a _single_ thread existing in the address space
> (fork and execve).
Hi Lars,
Neither fork nor exec is limited to a single thread existing in the
address space - I'm not sure what you're saying? fork by its very
definition results in two.
Thanks,
Paul
> BR,
> Lars
>
> > On 1 March 2016, at 03:39, Paul Burton <paul.burton@imgtec.com> wrote:
> >
> > It's possible for pages to become visible prior to update_mmu_cache
> > running if a thread within the same address space preempts the current
> > thread or runs simultaneously on another CPU. That is, the following
> > scenario is possible:
> >
> > CPU0 CPU1
> >
> > write to page
> > flush_dcache_page
> > flush_icache_page
> > set_pte_at
> > map page
> > update_mmu_cache
> >
> > If CPU1 maps the page in between CPU0's set_pte_at, which marks it valid
> > & visible, and update_mmu_cache where the dcache flush occurs, then CPU1's
> > icache will fill from stale data (unless it fills from the dcache, in
> > which case all is good, but most MIPS CPUs don't have this property).
> > Commit 4d46a67a3eb8 ("MIPS: Fix race condition in lazy cache flushing.")
> > attempted to fix that by performing the dcache flush in
> > flush_icache_page such that it occurs before the set_pte_at call makes
> > the page visible. However it has the problem that not all code that
> > writes to pages exposed to userland call flush_icache_page. There are
> > many callers of set_pte_at under mm/ and only 2 of them do call
> > flush_icache_page. Thus the race window between a page becoming visible
> > & being coherent between the icache & dcache remains open in some cases.
> >
> > To illustrate some of the cases, a WARN was added to __update_cache with
> > this patch applied that triggered in cases where a page about to be
> > flushed from the dcache was not the last page provided to
> > flush_icache_page. That is, backtraces were obtained for cases in which
> > the race window is left open without this patch. The 2 standout examples
> > follow.
> >
> > When forking a process:
> >
> > [ 15.271842] [<80417630>] __update_cache+0xcc/0x188
> > [ 15.277274] [<80530394>] copy_page_range+0x56c/0x6ac
> > [ 15.282861] [<8042936c>] copy_process.part.54+0xd40/0x17ac
> > [ 15.289028] [<80429f80>] do_fork+0xe4/0x420
> > [ 15.293747] [<80413808>] handle_sys+0x128/0x14c
> >
> > When exec'ing an ELF binary:
> >
> > [ 14.445964] [<80417630>] __update_cache+0xcc/0x188
> > [ 14.451369] [<80538d88>] move_page_tables+0x414/0x498
> > [ 14.457075] [<8055d848>] setup_arg_pages+0x220/0x318
> > [ 14.462685] [<805b0f38>] load_elf_binary+0x530/0x12a0
> > [ 14.468374] [<8055ec3c>] search_binary_handler+0xbc/0x214
> > [ 14.474444] [<8055f6c0>] do_execveat_common+0x43c/0x67c
> > [ 14.480324] [<8055f938>] do_execve+0x38/0x44
> > [ 14.485137] [<80413808>] handle_sys+0x128/0x14c
> >
> > These code paths write into a page, call flush_dcache_page then call
> > set_pte_at without flush_icache_page in between. The end result is that
> > the icache can become corrupted & userland processes may execute
> > unexpected or invalid code, typically resulting in a reserved
> > instruction exception, a trap or a segfault.
> >
> > Fix this race condition fully by performing any cache maintenance
> > required to keep the icache & dcache in sync in set_pte_at, before the
> > page is made valid. This has the added bonus of ensuring the cache
> > maintenance always happens in one location, rather than being duplicated
> > in flush_icache_page & update_mmu_cache. It also matches the way other
> > architectures solve the same problem (see arm, ia64 & powerpc).
> >
> > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > Reported-by: Ionela Voinescu <ionela.voinescu@imgtec.com>
> > Cc: Lars Persson <lars.persson@axis.com>
> > Cc: stable <stable@vger.kernel.org> # v4.1+
> > Fixes: 4d46a67a3eb8 ("MIPS: Fix race condition in lazy cache flushing.")
> >
> > ---
> >
> > arch/mips/include/asm/cacheflush.h | 6 ------
> > arch/mips/include/asm/pgtable.h | 26 +++++++++++++++++++++-----
> > arch/mips/mm/cache.c | 19 +++----------------
> > 3 files changed, 24 insertions(+), 27 deletions(-)
> >
> > diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
> > index 7e9f468..34ed22e 100644
> > --- a/arch/mips/include/asm/cacheflush.h
> > +++ b/arch/mips/include/asm/cacheflush.h
> > @@ -51,7 +51,6 @@ extern void (*flush_cache_range)(struct vm_area_struct *vma,
> > unsigned long start, unsigned long end);
> > extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn);
> > extern void __flush_dcache_page(struct page *page);
> > -extern void __flush_icache_page(struct vm_area_struct *vma, struct page *page);
> >
> > #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
> > static inline void flush_dcache_page(struct page *page)
> > @@ -77,11 +76,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
> > static inline void flush_icache_page(struct vm_area_struct *vma,
> > struct page *page)
> > {
> > - if (!cpu_has_ic_fills_f_dc && (vma->vm_flags & VM_EXEC) &&
> > - Page_dcache_dirty(page)) {
> > - __flush_icache_page(vma, page);
> > - ClearPageDcacheDirty(page);
> > - }
> > }
> >
> > extern void (*flush_icache_range)(unsigned long start, unsigned long end);
> > diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
> > index 9a4fe01..65bf2c0 100644
> > --- a/arch/mips/include/asm/pgtable.h
> > +++ b/arch/mips/include/asm/pgtable.h
> > @@ -127,10 +127,14 @@ do { \
> > } \
> > } while(0)
> >
> > +static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
> > + pte_t *ptep, pte_t pteval);
> > +
> > #if defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32)
> >
> > #define pte_none(pte) (!(((pte).pte_high) & ~_PAGE_GLOBAL))
> > #define pte_present(pte) ((pte).pte_low & _PAGE_PRESENT)
> > +#define pte_no_exec(pte) ((pte).pte_low & _PAGE_NO_EXEC)
> >
> > static inline void set_pte(pte_t *ptep, pte_t pte)
> > {
> > @@ -148,7 +152,6 @@ static inline void set_pte(pte_t *ptep, pte_t pte)
> > buddy->pte_high |= _PAGE_GLOBAL;
> > }
> > }
> > -#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
> >
> > static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> > {
> > @@ -166,6 +169,7 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
> >
> > #define pte_none(pte) (!(pte_val(pte) & ~_PAGE_GLOBAL))
> > #define pte_present(pte) (pte_val(pte) & _PAGE_PRESENT)
> > +#define pte_no_exec(pte) (pte_val(pte) & _PAGE_NO_EXEC)
> >
> > /*
> > * Certain architectures need to do special things when pte's
> > @@ -218,7 +222,6 @@ static inline void set_pte(pte_t *ptep, pte_t pteval)
> > }
> > #endif
> > }
> > -#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
> >
> > static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> > {
> > @@ -234,6 +237,22 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
> > }
> > #endif
> >
> > +static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
> > + pte_t *ptep, pte_t pteval)
> > +{
> > + extern void __update_cache(unsigned long address, pte_t pte);
> > +
> > + if (!pte_present(pteval))
> > + goto cache_sync_done;
> > +
> > + if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
> > + goto cache_sync_done;
> > +
> > + __update_cache(addr, pteval);
> > +cache_sync_done:
> > + set_pte(ptep, pteval);
> > +}
> > +
> > /*
> > * (pmds are folded into puds so this doesn't get actually called,
> > * but the define is needed for a generic inline function.)
> > @@ -430,15 +449,12 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
> >
> > extern void __update_tlb(struct vm_area_struct *vma, unsigned long address,
> > pte_t pte);
> > -extern void __update_cache(struct vm_area_struct *vma, unsigned long address,
> > - pte_t pte);
> >
> > static inline void update_mmu_cache(struct vm_area_struct *vma,
> > unsigned long address, pte_t *ptep)
> > {
> > pte_t pte = *ptep;
> > __update_tlb(vma, address, pte);
> > - __update_cache(vma, address, pte);
> > }
> >
> > static inline void update_mmu_cache_pmd(struct vm_area_struct *vma,
> > diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
> > index 8befa55..bf04c6c 100644
> > --- a/arch/mips/mm/cache.c
> > +++ b/arch/mips/mm/cache.c
> > @@ -125,30 +125,17 @@ void __flush_anon_page(struct page *page, unsigned long vmaddr)
> >
> > EXPORT_SYMBOL(__flush_anon_page);
> >
> > -void __flush_icache_page(struct vm_area_struct *vma, struct page *page)
> > -{
> > - unsigned long addr;
> > -
> > - if (PageHighMem(page))
> > - return;
> > -
> > - addr = (unsigned long) page_address(page);
> > - flush_data_cache_page(addr);
> > -}
> > -EXPORT_SYMBOL_GPL(__flush_icache_page);
> > -
> > -void __update_cache(struct vm_area_struct *vma, unsigned long address,
> > - pte_t pte)
> > +void __update_cache(unsigned long address, pte_t pte)
> > {
> > struct page *page;
> > unsigned long pfn, addr;
> > - int exec = (vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc;
> > + int exec = !pte_no_exec(pte) && !cpu_has_ic_fills_f_dc;
> >
> > pfn = pte_pfn(pte);
> > if (unlikely(!pfn_valid(pfn)))
> > return;
> > page = pfn_to_page(pfn);
> > - if (page_mapping(page) && Page_dcache_dirty(page)) {
> > + if (Page_dcache_dirty(page)) {
> > if (PageHighMem(page))
> > addr = (unsigned long)kmap_atomic(page);
> > else
> > --
> > 2.7.1
> >
* Re: [PATCH 4/4] MIPS: Sync icache & dcache in set_pte_at
2016-03-05 0:21 ` Paul Burton
@ 2016-03-05 0:27 ` Paul Burton
0 siblings, 0 replies; 14+ messages in thread
From: Paul Burton @ 2016-03-05 0:27 UTC (permalink / raw)
To: Lars Persson
Cc: linux-mips@linux-mips.org, stable # v4.1+, Steven J. Hill,
David Daney, Huacai Chen, Aneesh Kumar K.V,
linux-kernel@vger.kernel.org, Andrew Morton, Jerome Marchand,
Kirill A. Shutemov, Ralf Baechle
On Sat, Mar 05, 2016 at 12:21:54AM +0000, Paul Burton wrote:
> On Fri, Mar 04, 2016 at 07:02:24PM +0000, Lars Persson wrote:
> > Hi
> >
> > Some further thoughts on the matter. You have so far not showed a
> > valid example of a race condition. The two examples you give in the
> > commit message are for a _single_ thread existing in the address space
> > (fork and execve).
>
> Hi Lars,
>
> Neither fork nor exec are limited to a single thread existing in the
> address space - I'm not sure what you're saying? fork by its very
> definition results in 2.
Ok, exec kinda is (it's late...). Still, fork clearly isn't.
Thanks,
Paul
> Thanks,
> Paul
>
> > BR,
> > Lars
> >
> > > 1 mars 2016 kl. 03:39 skrev Paul Burton <paul.burton@imgtec.com>:
> > >
> > > It's possible for pages to become visible prior to update_mmu_cache
> > > running if a thread within the same address space preempts the current
> > > thread or runs simultaneously on another CPU. That is, the following
> > > scenario is possible:
> > >
> > > CPU0 CPU1
> > >
> > > write to page
> > > flush_dcache_page
> > > flush_icache_page
> > > set_pte_at
> > > map page
> > > update_mmu_cache
> > >
> > > If CPU1 maps the page in between CPU0's set_pte_at, which marks it valid
> > > & visible, and update_mmu_cache where the dcache flush occurs then CPU1s
> > > icache will fill from stale data (unless it fills from the dcache, in
> > > which case all is good, but most MIPS CPUs don't have this property).
> > > Commit 4d46a67a3eb8 ("MIPS: Fix race condition in lazy cache flushing.")
> > > attempted to fix that by performing the dcache flush in
> > > flush_icache_page such that it occurs before the set_pte_at call makes
> > > the page visible. However it has the problem that not all code that
> > > writes to pages exposed to userland call flush_icache_page. There are
> > > many callers of set_pte_at under mm/ and only 2 of them do call
> > > flush_icache_page. Thus the race window between a page becoming visible
> > > & being coherent between the icache & dcache remains open in some cases.
> > >
> > > To illustrate some of the cases, a WARN was added to __update_cache with
> > > this patch applied that triggered in cases where a page about to be
> > > flushed from the dcache was not the last page provided to
> > > flush_icache_page. That is, backtraces were obtained for cases in which
> > > the race window is left open without this patch. The 2 standout examples
> > > follow.
> > >
> > > When forking a process:
> > >
> > > [ 15.271842] [<80417630>] __update_cache+0xcc/0x188
> > > [ 15.277274] [<80530394>] copy_page_range+0x56c/0x6ac
> > > [ 15.282861] [<8042936c>] copy_process.part.54+0xd40/0x17ac
> > > [ 15.289028] [<80429f80>] do_fork+0xe4/0x420
> > > [ 15.293747] [<80413808>] handle_sys+0x128/0x14c
> > >
> > > When exec'ing an ELF binary:
> > >
> > > [ 14.445964] [<80417630>] __update_cache+0xcc/0x188
> > > [ 14.451369] [<80538d88>] move_page_tables+0x414/0x498
> > > [ 14.457075] [<8055d848>] setup_arg_pages+0x220/0x318
> > > [ 14.462685] [<805b0f38>] load_elf_binary+0x530/0x12a0
> > > [ 14.468374] [<8055ec3c>] search_binary_handler+0xbc/0x214
> > > [ 14.474444] [<8055f6c0>] do_execveat_common+0x43c/0x67c
> > > [ 14.480324] [<8055f938>] do_execve+0x38/0x44
> > > [ 14.485137] [<80413808>] handle_sys+0x128/0x14c
> > >
> > > These code paths write into a page, call flush_dcache_page then call
> > > set_pte_at without flush_icache_page in between. The end result is that
> > > the icache can become corrupted & userland processes may execute
> > > unexpected or invalid code, typically resulting in a reserved
> > > instruction exception, a trap or a segfault.
> > >
> > > Fix this race condition fully by performing any cache maintenance
> > > required to keep the icache & dcache in sync in set_pte_at, before the
> > > page is made valid. This has the added bonus of ensuring the cache
> > > maintenance always happens in one location, rather than being duplicated
> > > in flush_icache_page & update_mmu_cache. It also matches the way other
> > > architectures solve the same problem (see arm, ia64 & powerpc).
> > >
> > > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > > Reported-by: Ionela Voinescu <ionela.voinescu@imgtec.com>
> > > Cc: Lars Persson <lars.persson@axis.com>
> > > Cc: stable <stable@vger.kernel.org> # v4.1+
> > > Fixes: 4d46a67a3eb8 ("MIPS: Fix race condition in lazy cache flushing.")
> > >
> > > ---
> > >
> > > arch/mips/include/asm/cacheflush.h | 6 ------
> > > arch/mips/include/asm/pgtable.h | 26 +++++++++++++++++++++-----
> > > arch/mips/mm/cache.c | 19 +++----------------
> > > 3 files changed, 24 insertions(+), 27 deletions(-)
> > >
> > > diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
> > > index 7e9f468..34ed22e 100644
> > > --- a/arch/mips/include/asm/cacheflush.h
> > > +++ b/arch/mips/include/asm/cacheflush.h
> > > @@ -51,7 +51,6 @@ extern void (*flush_cache_range)(struct vm_area_struct *vma,
> > > unsigned long start, unsigned long end);
> > > extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn);
> > > extern void __flush_dcache_page(struct page *page);
> > > -extern void __flush_icache_page(struct vm_area_struct *vma, struct page *page);
> > >
> > > #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
> > > static inline void flush_dcache_page(struct page *page)
> > > @@ -77,11 +76,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
> > > static inline void flush_icache_page(struct vm_area_struct *vma,
> > > struct page *page)
> > > {
> > > - if (!cpu_has_ic_fills_f_dc && (vma->vm_flags & VM_EXEC) &&
> > > - Page_dcache_dirty(page)) {
> > > - __flush_icache_page(vma, page);
> > > - ClearPageDcacheDirty(page);
> > > - }
> > > }
> > >
> > > extern void (*flush_icache_range)(unsigned long start, unsigned long end);
> > > diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
> > > index 9a4fe01..65bf2c0 100644
> > > --- a/arch/mips/include/asm/pgtable.h
> > > +++ b/arch/mips/include/asm/pgtable.h
> > > @@ -127,10 +127,14 @@ do { \
> > > } \
> > > } while(0)
> > >
> > > +static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
> > > + pte_t *ptep, pte_t pteval);
> > > +
> > > #if defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32)
> > >
> > > #define pte_none(pte) (!(((pte).pte_high) & ~_PAGE_GLOBAL))
> > > #define pte_present(pte) ((pte).pte_low & _PAGE_PRESENT)
> > > +#define pte_no_exec(pte) ((pte).pte_low & _PAGE_NO_EXEC)
> > >
> > > static inline void set_pte(pte_t *ptep, pte_t pte)
> > > {
> > > @@ -148,7 +152,6 @@ static inline void set_pte(pte_t *ptep, pte_t pte)
> > > buddy->pte_high |= _PAGE_GLOBAL;
> > > }
> > > }
> > > -#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
> > >
> > > static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> > > {
> > > @@ -166,6 +169,7 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
> > >
> > > #define pte_none(pte) (!(pte_val(pte) & ~_PAGE_GLOBAL))
> > > #define pte_present(pte) (pte_val(pte) & _PAGE_PRESENT)
> > > +#define pte_no_exec(pte) (pte_val(pte) & _PAGE_NO_EXEC)
> > >
> > > /*
> > > * Certain architectures need to do special things when pte's
> > > @@ -218,7 +222,6 @@ static inline void set_pte(pte_t *ptep, pte_t pteval)
> > > }
> > > #endif
> > > }
> > > -#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
> > >
> > > static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> > > {
> > > @@ -234,6 +237,22 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
> > > }
> > > #endif
> > >
> > > +static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
> > > + pte_t *ptep, pte_t pteval)
> > > +{
> > > + extern void __update_cache(unsigned long address, pte_t pte);
> > > +
> > > + if (!pte_present(pteval))
> > > + goto cache_sync_done;
> > > +
> > > + if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
> > > + goto cache_sync_done;
> > > +
> > > + __update_cache(addr, pteval);
> > > +cache_sync_done:
> > > + set_pte(ptep, pteval);
> > > +}
> > > +
> > > /*
> > > * (pmds are folded into puds so this doesn't get actually called,
> > > * but the define is needed for a generic inline function.)
> > > @@ -430,15 +449,12 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
> > >
> > > extern void __update_tlb(struct vm_area_struct *vma, unsigned long address,
> > > pte_t pte);
> > > -extern void __update_cache(struct vm_area_struct *vma, unsigned long address,
> > > - pte_t pte);
> > >
> > > static inline void update_mmu_cache(struct vm_area_struct *vma,
> > > unsigned long address, pte_t *ptep)
> > > {
> > > pte_t pte = *ptep;
> > > __update_tlb(vma, address, pte);
> > > - __update_cache(vma, address, pte);
> > > }
> > >
> > > static inline void update_mmu_cache_pmd(struct vm_area_struct *vma,
> > > diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
> > > index 8befa55..bf04c6c 100644
> > > --- a/arch/mips/mm/cache.c
> > > +++ b/arch/mips/mm/cache.c
> > > @@ -125,30 +125,17 @@ void __flush_anon_page(struct page *page, unsigned long vmaddr)
> > >
> > > EXPORT_SYMBOL(__flush_anon_page);
> > >
> > > -void __flush_icache_page(struct vm_area_struct *vma, struct page *page)
> > > -{
> > > - unsigned long addr;
> > > -
> > > - if (PageHighMem(page))
> > > - return;
> > > -
> > > - addr = (unsigned long) page_address(page);
> > > - flush_data_cache_page(addr);
> > > -}
> > > -EXPORT_SYMBOL_GPL(__flush_icache_page);
> > > -
> > > -void __update_cache(struct vm_area_struct *vma, unsigned long address,
> > > - pte_t pte)
> > > +void __update_cache(unsigned long address, pte_t pte)
> > > {
> > > struct page *page;
> > > unsigned long pfn, addr;
> > > - int exec = (vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc;
> > > + int exec = !pte_no_exec(pte) && !cpu_has_ic_fills_f_dc;
> > >
> > > pfn = pte_pfn(pte);
> > > if (unlikely(!pfn_valid(pfn)))
> > > return;
> > > page = pfn_to_page(pfn);
> > > - if (page_mapping(page) && Page_dcache_dirty(page)) {
> > > + if (Page_dcache_dirty(page)) {
> > > if (PageHighMem(page))
> > > addr = (unsigned long)kmap_atomic(page);
> > > else
> > > --
> > > 2.7.1
> > >
* Re: [PATCH 1/4] MIPS: Flush dcache for flush_kernel_dcache_page
2016-03-04 15:09 ` Lars Persson
@ 2016-03-29 8:29 ` Ralf Baechle
0 siblings, 0 replies; 14+ messages in thread
From: Ralf Baechle @ 2016-03-29 8:29 UTC (permalink / raw)
To: Lars Persson
Cc: Paul Burton, Steven J. Hill, linux-mips@linux-mips.org,
linux-kernel@vger.kernel.org
On Fri, Mar 04, 2016 at 03:09:00PM +0000, Lars Persson wrote:
> > 1 mars 2016 kl. 03:38 Paul Burton <paul.burton@imgtec.com>:
> >
> > The flush_kernel_dcache_page function was previously essentially a nop.
> > This is incorrect for MIPS, where if a page has been modified & either
> > it aliases or it's executable & the icache doesn't fill from dcache then
> > the content needs to be written back from dcache to the next level of
> > the cache hierarchy (which is shared with the icache).
> >
> > Implement this by simply calling flush_dcache_page, treating this
> > kmapped cache flush function (flush_kernel_dcache_page) exactly the same
> > as its non-kmapped counterpart (flush_dcache_page).
> >
> > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > ---
> >
> > arch/mips/include/asm/cacheflush.h | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
> > index 723229f..7e9f468 100644
> > --- a/arch/mips/include/asm/cacheflush.h
> > +++ b/arch/mips/include/asm/cacheflush.h
> > @@ -132,6 +132,7 @@ static inline void kunmap_noncoherent(void)
> > static inline void flush_kernel_dcache_page(struct page *page)
> > {
> > BUG_ON(cpu_has_dc_aliases && PageHighMem(page));
> > + flush_dcache_page(page);
>
> Should we instead use __flush_dcache_page(), which flushes immediately for mapped pages? Steven J. Hill's old patch set for highmem did it like this.
Delayed flushing should be ok for lowmem where each page has a permanent
virtual address. With highmem the temporary address assigned by the kmap_*
function may change so the flush needs to be performed immediately.
Highmem without cache aliases is a special case - the exact virtual address
doesn't matter, so this should be fine.
Cache flushes are expensive, so delaying them when possible is always a good thing.
Steven's patches AFAIR were trying to tackle the highmem-with-aliases case,
so an immediate flush was required.
Ralf
* Re: [PATCH 2/4] MIPS: Flush highmem pages in __flush_dcache_page
2016-03-01 2:37 ` [PATCH 2/4] MIPS: Flush highmem pages in __flush_dcache_page Paul Burton
@ 2016-03-29 8:35 ` Ralf Baechle
2016-03-29 8:55 ` Paul Burton
0 siblings, 1 reply; 14+ messages in thread
From: Ralf Baechle @ 2016-03-29 8:35 UTC (permalink / raw)
To: Paul Burton
Cc: linux-mips, Lars Persson, linux-kernel, Andrew Morton,
Kirill A. Shutemov
On Tue, Mar 01, 2016 at 02:37:57AM +0000, Paul Burton wrote:
> When flush_dcache_page is called on an executable page, that page is
> about to be provided to userland & we can presume that the icache
> contains no valid entries for its address range. However if the icache
> does not fill from the dcache then we cannot presume that the page's
> content has been written back as far as the memories that the dcache
> will fill from (i.e. L2 or further out).
>
> This was being done for lowmem pages, but not for highmem which can lead
> to icache corruption. Fix this by mapping highmem pages & flushing their
> content from the dcache in __flush_dcache_page before providing the page
> to userland, just as is done for lowmem pages.
>
> Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> Cc: Lars Persson <lars.persson@axis.com>
> ---
>
> arch/mips/mm/cache.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
> index 3f159ca..5a67d8c 100644
> --- a/arch/mips/mm/cache.c
> +++ b/arch/mips/mm/cache.c
> @@ -16,6 +16,7 @@
> #include <linux/mm.h>
>
> #include <asm/cacheflush.h>
> +#include <asm/highmem.h>
> #include <asm/processor.h>
> #include <asm/cpu.h>
> #include <asm/cpu-features.h>
> @@ -83,8 +84,6 @@ void __flush_dcache_page(struct page *page)
> struct address_space *mapping = page_mapping(page);
> unsigned long addr;
>
> - if (PageHighMem(page))
> - return;
> if (mapping && !mapping_mapped(mapping)) {
> SetPageDcacheDirty(page);
> return;
> @@ -95,8 +94,15 @@ void __flush_dcache_page(struct page *page)
> * case is for exec env/arg pages and those are %99 certainly going to
> * get faulted into the tlb (and thus flushed) anyways.
> */
> - addr = (unsigned long) page_address(page);
> + if (PageHighMem(page))
> + addr = (unsigned long)kmap_atomic(page);
> + else
> + addr = (unsigned long)page_address(page);
> +
> flush_data_cache_page(addr);
> +
> + if (PageHighMem(page))
> + __kunmap_atomic((void *)addr);
> }
>
> EXPORT_SYMBOL(__flush_dcache_page);
I don't see how this should work with cache aliases. If the page is unmapped
kmap_atomic will pick a deterministic address only under some circumstances,
kmap won't. As a result the wrong cache way will be flushed out, I think.
Ralf
* Re: [PATCH 3/4] MIPS: Handle highmem pages in __update_cache
2016-03-01 2:37 ` [PATCH 3/4] MIPS: Handle highmem pages in __update_cache Paul Burton
@ 2016-03-29 8:39 ` Ralf Baechle
0 siblings, 0 replies; 14+ messages in thread
From: Ralf Baechle @ 2016-03-29 8:39 UTC (permalink / raw)
To: Paul Burton
Cc: linux-mips, Lars Persson, stable # v4.1+, linux-kernel,
Andrew Morton, Jerome Marchand, Kirill A. Shutemov
On Tue, Mar 01, 2016 at 02:37:58AM +0000, Paul Burton wrote:
> The following patch will expose __update_cache to highmem pages. Handle
> them by mapping them in for the duration of the cache maintenance, just
> like in __flush_dcache_page. The code for that isn't shared because we
> need the page address in __update_cache so sharing became messy. Given
> that the entirety is an extra 5 lines, just duplicate it.
>
> Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> Cc: Lars Persson <lars.persson@axis.com>
> Cc: stable <stable@vger.kernel.org> # v4.1+
> ---
>
> arch/mips/mm/cache.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
> index 5a67d8c..8befa55 100644
> --- a/arch/mips/mm/cache.c
> +++ b/arch/mips/mm/cache.c
> @@ -149,9 +149,17 @@ void __update_cache(struct vm_area_struct *vma, unsigned long address,
> return;
> page = pfn_to_page(pfn);
> if (page_mapping(page) && Page_dcache_dirty(page)) {
> - addr = (unsigned long) page_address(page);
> + if (PageHighMem(page))
> + addr = (unsigned long)kmap_atomic(page);
> + else
> + addr = (unsigned long)page_address(page);
> +
> if (exec || pages_do_alias(addr, address & PAGE_MASK))
> flush_data_cache_page(addr);
> +
> + if (PageHighMem(page))
> + __kunmap_atomic((void *)addr);
> +
> ClearPageDcacheDirty(page);
> }
> }
Yet again this is betting the house on getting the right virtual address
from kmap_atomic.
Ralf
* Re: [PATCH 2/4] MIPS: Flush highmem pages in __flush_dcache_page
2016-03-29 8:35 ` Ralf Baechle
@ 2016-03-29 8:55 ` Paul Burton
0 siblings, 0 replies; 14+ messages in thread
From: Paul Burton @ 2016-03-29 8:55 UTC (permalink / raw)
To: Ralf Baechle
Cc: linux-mips, Lars Persson, linux-kernel, Andrew Morton,
Kirill A. Shutemov
On Tue, Mar 29, 2016 at 10:35:43AM +0200, Ralf Baechle wrote:
> On Tue, Mar 01, 2016 at 02:37:57AM +0000, Paul Burton wrote:
>
> > When flush_dcache_page is called on an executable page, that page is
> > about to be provided to userland & we can presume that the icache
> > contains no valid entries for its address range. However if the icache
> > does not fill from the dcache then we cannot presume that the page's
> > content has been written back as far as the memories that the dcache
> > will fill from (i.e. L2 or further out).
> >
> > This was being done for lowmem pages, but not for highmem which can lead
> > to icache corruption. Fix this by mapping highmem pages & flushing their
> > content from the dcache in __flush_dcache_page before providing the page
> > to userland, just as is done for lowmem pages.
> >
> > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > Cc: Lars Persson <lars.persson@axis.com>
> > ---
> >
> > arch/mips/mm/cache.c | 12 +++++++++---
> > 1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
> > index 3f159ca..5a67d8c 100644
> > --- a/arch/mips/mm/cache.c
> > +++ b/arch/mips/mm/cache.c
> > @@ -16,6 +16,7 @@
> > #include <linux/mm.h>
> >
> > #include <asm/cacheflush.h>
> > +#include <asm/highmem.h>
> > #include <asm/processor.h>
> > #include <asm/cpu.h>
> > #include <asm/cpu-features.h>
> > @@ -83,8 +84,6 @@ void __flush_dcache_page(struct page *page)
> > struct address_space *mapping = page_mapping(page);
> > unsigned long addr;
> >
> > - if (PageHighMem(page))
> > - return;
> > if (mapping && !mapping_mapped(mapping)) {
> > SetPageDcacheDirty(page);
> > return;
> > @@ -95,8 +94,15 @@ void __flush_dcache_page(struct page *page)
> > * case is for exec env/arg pages and those are %99 certainly going to
> > * get faulted into the tlb (and thus flushed) anyways.
> > */
> > - addr = (unsigned long) page_address(page);
> > + if (PageHighMem(page))
> > + addr = (unsigned long)kmap_atomic(page);
> > + else
> > + addr = (unsigned long)page_address(page);
> > +
> > flush_data_cache_page(addr);
> > +
> > + if (PageHighMem(page))
> > + __kunmap_atomic((void *)addr);
> > }
> >
> > EXPORT_SYMBOL(__flush_dcache_page);
>
> I don't see how this should work with cache aliases. If the page is unmapped
> kmap_atomic will pick a deterministic address only under some circumstances,
> kmap won't. As a result the wrong cache way will be flushed out, I think.
>
> Ralf
Hi Ralf,
None of the systems I tested this on have cache aliases, and highmem on
systems with cache aliases is currently unsupported to the extent that
we BUG_ON such cases in flush_kernel_dcache_page.
So hopefully you won't require that this change, which makes highmem work
without aliases, also fix the currently-unsupported highmem-with-aliases case?
Thanks,
Paul
end of thread, newest: 2016-03-29 8:55 UTC
Thread overview: 14+ messages
2016-03-01 2:37 [PATCH 0/4] MIPS cache & highmem fixes Paul Burton
2016-03-01 2:37 ` [PATCH 1/4] MIPS: Flush dcache for flush_kernel_dcache_page Paul Burton
2016-03-04 15:09 ` Lars Persson
2016-03-29 8:29 ` Ralf Baechle
2016-03-01 2:37 ` [PATCH 2/4] MIPS: Flush highmem pages in __flush_dcache_page Paul Burton
2016-03-29 8:35 ` Ralf Baechle
2016-03-29 8:55 ` Paul Burton
2016-03-01 2:37 ` [PATCH 3/4] MIPS: Handle highmem pages in __update_cache Paul Burton
2016-03-29 8:39 ` Ralf Baechle
[not found] ` <1456799879-14711-5-git-send-email-paul.burton@imgtec.com>
[not found] ` <56D5CDB3.80407@caviumnetworks.com>
[not found] ` <20160301171940.GA26791@NP-P-BURTON>
2016-03-02 14:12 ` [PATCH 4/4] MIPS: Sync icache & dcache in set_pte_at Ralf Baechle
2016-03-03 3:03 ` [4/4] " Leonid Yegoshin
2016-03-04 19:02 ` [PATCH 4/4] " Lars Persson
2016-03-05 0:21 ` Paul Burton
2016-03-05 0:27 ` Paul Burton