* [PATCH 0/2] riscv: asid: switch to alternative way to fix stale TLB entries
@ 2023-02-26 15:01 Sergey Matyukevich
From: Sergey Matyukevich @ 2023-02-26 15:01 UTC (permalink / raw)
To: linux-riscv, linux-arch
Cc: Lad Prabhakar, Zong Li, Guo Ren, Albert Ou, Palmer Dabbelt,
Paul Walmsley, Sergey Matyukevich
Hi all,
Some time ago, two different patches were posted to fix the stale TLB
entries that caused application crashes.
The patch [0] suggested 'aggregating' mm_cpumask, i.e. the current CPU is
not cleared for the switched-out task in the switch_mm() function. For
additional explanation see the commit message by Guo Ren. The same
approach is used by the arc architecture, so another good reference is
the comment for switch_mm() in arch/arc/include/asm/mmu_context.h.
The patch [1] attempted to reduce the number of TLB flushes by deferring
(and possibly avoiding) them for CPUs not running the task.
Patch [1] has been merged. However, we have already received two bug
reports from different vendors, so apparently something is missing in
the approach suggested in [1]. In both cases patch [0] fixed the issue.
This patch series reverts [1] and replaces it with [0].
Regards,
Sergey
[0] https://lore.kernel.org/linux-riscv/20221111075902.798571-1-guoren@kernel.org/
[1] https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/
Guo Ren (1):
  riscv: asid: Fixup stale TLB entry cause application crash

Sergey Matyukevich (1):
  Revert "riscv: mm: notify remote harts about mmu cache updates"

 arch/riscv/include/asm/mmu.h      |  2 --
 arch/riscv/include/asm/tlbflush.h | 18 --------------
 arch/riscv/mm/context.c           | 40 +++++++++++++++----------------
 arch/riscv/mm/tlbflush.c          | 28 +++++++++++++---------
 4 files changed, 37 insertions(+), 51 deletions(-)
--
2.39.2
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
* [PATCH 1/2] Revert "riscv: mm: notify remote harts about mmu cache updates"
From: Sergey Matyukevich @ 2023-02-26 15:01 UTC (permalink / raw)
To: linux-riscv, linux-arch
Cc: Lad Prabhakar, Zong Li, Guo Ren, Albert Ou, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Sergey Matyukevich, stable

From: Sergey Matyukevich <sergey.matyukevich@syntacore.com>

This reverts the remaining bits of commit 4bd1d80efb5a ("riscv: mm:
notify remote harts about mmu cache updates").

According to bug reports, the suggested approach to fix stale TLB
entries is not sufficient. It needs to be replaced by a more robust
solution.

Fixes: 4bd1d80efb5a ("riscv: mm: notify remote harts about mmu cache updates")
Reported-by: Zong Li <zong.li@sifive.com>
Reported-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Cc: stable@vger.kernel.org

---
 arch/riscv/include/asm/mmu.h      |  2 --
 arch/riscv/include/asm/tlbflush.h | 18 ------------------
 arch/riscv/mm/context.c           | 10 ----------
 arch/riscv/mm/tlbflush.c          | 28 +++++++++++++++++-----------
 4 files changed, 17 insertions(+), 41 deletions(-)

diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
index 5ff1f19fd45c..0099dc116168 100644
--- a/arch/riscv/include/asm/mmu.h
+++ b/arch/riscv/include/asm/mmu.h
@@ -19,8 +19,6 @@ typedef struct {
 #ifdef CONFIG_SMP
 	/* A local icache flush is needed before user execution can resume. */
 	cpumask_t icache_stale_mask;
-	/* A local tlb flush is needed before user execution can resume. */
-	cpumask_t tlb_stale_mask;
 #endif
 } mm_context_t;
 
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 907b9efd39a8..801019381dea 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -22,24 +22,6 @@ static inline void local_flush_tlb_page(unsigned long addr)
 {
 	ALT_FLUSH_TLB_PAGE(__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory"));
 }
-
-static inline void local_flush_tlb_all_asid(unsigned long asid)
-{
-	__asm__ __volatile__ ("sfence.vma x0, %0"
-			:
-			: "r" (asid)
-			: "memory");
-}
-
-static inline void local_flush_tlb_page_asid(unsigned long addr,
-		unsigned long asid)
-{
-	__asm__ __volatile__ ("sfence.vma %0, %1"
-			:
-			: "r" (addr), "r" (asid)
-			: "memory");
-}
-
 #else /* CONFIG_MMU */
 #define local_flush_tlb_all()			do { } while (0)
 #define local_flush_tlb_page(addr)		do { } while (0)
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 80ce9caba8d2..7acbfbd14557 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -196,16 +196,6 @@ static void set_mm_asid(struct mm_struct *mm, unsigned int cpu)
 
 	if (need_flush_tlb)
 		local_flush_tlb_all();
-#ifdef CONFIG_SMP
-	else {
-		cpumask_t *mask = &mm->context.tlb_stale_mask;
-
-		if (cpumask_test_cpu(cpu, mask)) {
-			cpumask_clear_cpu(cpu, mask);
-			local_flush_tlb_all_asid(cntx & asid_mask);
-		}
-	}
-#endif
 }
 
 static void set_mm_noasid(struct mm_struct *mm)
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index ce7dfc81bb3f..37ed760d007c 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -5,7 +5,23 @@
 #include <linux/sched.h>
 #include <asm/sbi.h>
 #include <asm/mmu_context.h>
-#include <asm/tlbflush.h>
+
+static inline void local_flush_tlb_all_asid(unsigned long asid)
+{
+	__asm__ __volatile__ ("sfence.vma x0, %0"
+			:
+			: "r" (asid)
+			: "memory");
+}
+
+static inline void local_flush_tlb_page_asid(unsigned long addr,
+		unsigned long asid)
+{
+	__asm__ __volatile__ ("sfence.vma %0, %1"
+			:
+			: "r" (addr), "r" (asid)
+			: "memory");
+}
 
 void flush_tlb_all(void)
 {
@@ -15,7 +31,6 @@ void flush_tlb_all(void)
 static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
 				  unsigned long size, unsigned long stride)
 {
-	struct cpumask *pmask = &mm->context.tlb_stale_mask;
 	struct cpumask *cmask = mm_cpumask(mm);
 	unsigned int cpuid;
 	bool broadcast;
@@ -29,15 +44,6 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
 	if (static_branch_unlikely(&use_asid_allocator)) {
 		unsigned long asid = atomic_long_read(&mm->context.id);
 
-		/*
-		 * TLB will be immediately flushed on harts concurrently
-		 * executing this MM context. TLB flush on other harts
-		 * is deferred until this MM context migrates there.
-		 */
-		cpumask_setall(pmask);
-		cpumask_clear_cpu(cpuid, pmask);
-		cpumask_andnot(pmask, pmask, cmask);
-
 		if (broadcast) {
 			sbi_remote_sfence_vma_asid(cmask, start, size, asid);
 		} else if (size <= stride) {
-- 
2.39.2
* Re: [PATCH 1/2] Revert "riscv: mm: notify remote harts about mmu cache updates"
From: Guo Ren @ 2023-02-28  3:15 UTC (permalink / raw)
To: Sergey Matyukevich
Cc: linux-riscv, linux-arch, Lad Prabhakar, Zong Li, Albert Ou,
	Palmer Dabbelt, Paul Walmsley, Sergey Matyukevich, stable

Thx

Reviewed-by: Guo Ren <guoren@kernel.org>

On Sun, Feb 26, 2023 at 11:02 PM Sergey Matyukevich <geomatsi@gmail.com> wrote:
>
> From: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
>
> This reverts the remaining bits of commit 4bd1d80efb5a ("riscv: mm:
> notify remote harts about mmu cache updates").
>
> According to bug reports, the suggested approach to fix stale TLB
> entries is not sufficient. It needs to be replaced by a more robust
> solution.
>
> [...]

-- 
Best Regards
 Guo Ren
* [PATCH 2/2] riscv: asid: Fixup stale TLB entry cause application crash
From: Sergey Matyukevich @ 2023-02-26 15:01 UTC (permalink / raw)
To: linux-riscv, linux-arch
Cc: Lad Prabhakar, Zong Li, Guo Ren, Albert Ou, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Guo Ren, Sergey Matyukevich,
	Anup Patel, Palmer Dabbelt, stable

From: Guo Ren <guoren@linux.alibaba.com>

After use_asid_allocator is enabled, userspace applications may crash
due to stale TLB entries. Using only cpumask_clear_cpu without
local_flush_tlb_all cannot guarantee that a CPU's TLB entries are fresh.
Then set_mm_asid would let the userspace application read a stale value
through a stale TLB entry, while set_mm_noasid is okay.

Here is the symptom of the bug:
unhandled signal 11 code 0x1 (coredump)
   0x0000003fd6d22524 <+4>:     auipc   s0,0x70
   0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
=> 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
(gdb) i r s0
s0             0x8082ed1cc3198b21       0x8082ed1cc3198b21
(gdb) x /2x 0x3fd6d92490
0x3fd6d92490:   0xd80ac8a8      0x0000003f
The core dump file shows that register s0 is wrong, but the value in
memory is correct. Because 'ld s0, -148(s0)' used a stale mapping entry
in the TLB and got a wrong result from an incorrect physical address.

When the task ran on CPU0, which loaded/speculative-loaded the value of
address (0x3fd6d92490), the first version of the mapping entry was
PTWed into CPU0's TLB.
When the task switched from CPU0 to CPU1 (no local_tlb_flush_all here
because of asid), it happened to write a value to the address
(0x3fd6d92490). It caused do_page_fault -> wp_page_copy ->
ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need a
TLB flush, but CPU0 had cleared CPU0's bit in the previous switch_mm.
So we only flushed the CPU1 TLB and set the second version mapping of
the PTE. When the task switched from CPU1 to CPU0 again, CPU0 still used
a stale TLB mapping entry which contained a wrong target physical
address. It raised a bug when the task happened to read that value.

   CPU0                               CPU1
   - switch 'task' in
   - read addr (Fill stale mapping
     entry into TLB)
   - switch 'task' out (no tlb_flush)
                                      - switch 'task' in (no tlb_flush)
                                      - write addr cause pagefault
                                        do_page_fault() (change to
                                        new addr mapping)
                                          wp_page_copy()
                                            ptep_clear_flush()
                                              ptep_get_and_clear()
                                              & flush_tlb_page()
                                        write new value into addr
                                      - switch 'task' out (no tlb_flush)
   - switch 'task' in (no tlb_flush)
   - read addr again (Use stale
     mapping entry in TLB)
     get wrong value from old physical
     addr, BUG!

The solution is to keep all CPUs' footmarks in cpumask(mm) in switch_mm,
which guarantees that all stale TLB entries are invalidated during a TLB
flush.

Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Zong Li <zong.li@sifive.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Cc: Anup Patel <apatel@ventanamicro.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: stable@vger.kernel.org

---
 arch/riscv/mm/context.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 7acbfbd14557..0f784e3d307b 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -205,12 +205,24 @@ static void set_mm_noasid(struct mm_struct *mm)
 		local_flush_tlb_all();
 }
 
-static inline void set_mm(struct mm_struct *mm, unsigned int cpu)
+static inline void set_mm(struct mm_struct *prev,
+			  struct mm_struct *next, unsigned int cpu)
 {
-	if (static_branch_unlikely(&use_asid_allocator))
-		set_mm_asid(mm, cpu);
-	else
-		set_mm_noasid(mm);
+	/*
+	 * The mm_cpumask indicates which harts' TLBs contain the virtual
+	 * address mapping of the mm. Compared to noasid, using asid
+	 * can't guarantee that stale TLB entries are invalidated because
+	 * the asid mechanism wouldn't flush TLB for every switch_mm for
+	 * performance. So when using asid, keep all CPUs footmarks in
+	 * cpumask() until mm reset.
+	 */
+	cpumask_set_cpu(cpu, mm_cpumask(next));
+	if (static_branch_unlikely(&use_asid_allocator)) {
+		set_mm_asid(next, cpu);
+	} else {
+		cpumask_clear_cpu(cpu, mm_cpumask(prev));
+		set_mm_noasid(next);
+	}
 }
 
 static int __init asids_init(void)
@@ -264,7 +276,8 @@ static int __init asids_init(void)
 }
 early_initcall(asids_init);
 #else
-static inline void set_mm(struct mm_struct *mm, unsigned int cpu)
+static inline void set_mm(struct mm_struct *prev,
+			  struct mm_struct *next, unsigned int cpu)
 {
 	/* Nothing to do here when there is no MMU */
 }
@@ -317,10 +330,7 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	 */
 	cpu = smp_processor_id();
 
-	cpumask_clear_cpu(cpu, mm_cpumask(prev));
-	cpumask_set_cpu(cpu, mm_cpumask(next));
-
-	set_mm(next, cpu);
+	set_mm(prev, next, cpu);
 
 	flush_icache_deferred(next, cpu);
 }
-- 
2.39.2
* Re: [PATCH 2/2] riscv: asid: Fixup stale TLB entry cause application crash
From: Andrew Jones @ 2023-02-26 16:44 UTC (permalink / raw)
To: Sergey Matyukevich
Cc: linux-riscv, linux-arch, Lad Prabhakar, Zong Li, Guo Ren, Albert Ou,
	Palmer Dabbelt, Paul Walmsley, Guo Ren, Sergey Matyukevich,
	Anup Patel, Palmer Dabbelt, stable

On Sun, Feb 26, 2023 at 06:01:37PM +0300, Sergey Matyukevich wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> After use_asid_allocator is enabled, userspace applications may crash
> due to stale TLB entries. Using only cpumask_clear_cpu without
> local_flush_tlb_all cannot guarantee that a CPU's TLB entries are fresh.
> Then set_mm_asid would let the userspace application read a stale value
> through a stale TLB entry, while set_mm_noasid is okay.
>
> [...]

This is identical to what I reviewed before, so my r-b could have been
kept, anyway here it is again

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

Thanks,
drew
* Re: [PATCH 0/2] riscv: asid: switch to alternative way to fix stale TLB entries
From: patchwork-bot+linux-riscv @ 2023-03-10  2:30 UTC (permalink / raw)
To: Sergey Matyukevich
Cc: linux-riscv, linux-arch, prabhakar.mahadev-lad.rj, zong.li, guoren,
	aou, palmer, paul.walmsley

Hello:

This series was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Sun, 26 Feb 2023 18:01:35 +0300 you wrote:
> Hi all,
>
> Some time ago two different patches have been posted to fix stale TLB
> entries that caused applications crashes.
>
> The patch [0] suggested 'aggregating' mm_cpumask, i.e. current cpu is not
> cleared for the switched-out task in switch_mm function. For additional
> explanations see the commit message by Guo Ren. The same approach is
> used by arc architecture, so another good comment is for switch_mm
> in arch/arc/include/asm/mmu_context.h.
>
> [...]

Here is the summary with links:
  - [1/2] Revert "riscv: mm: notify remote harts about mmu cache updates"
    https://git.kernel.org/riscv/c/e921050022f1
  - [2/2] riscv: asid: Fixup stale TLB entry cause application crash
    https://git.kernel.org/riscv/c/82dd33fde026

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html