From: Sergey Matyukevich <geomatsi@gmail.com>
To: linux-riscv@lists.infradead.org, linux-arch@vger.kernel.org
Cc: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>,
Zong Li <zong.li@sifive.com>, Guo Ren <guoren@kernel.org>,
Albert Ou <aou@eecs.berkeley.edu>,
Palmer Dabbelt <palmer@dabbelt.com>,
Paul Walmsley <paul.walmsley@sifive.com>,
Sergey Matyukevich <geomatsi@gmail.com>,
Guo Ren <guoren@linux.alibaba.com>,
Sergey Matyukevich <sergey.matyukevich@syntacore.com>,
Anup Patel <apatel@ventanamicro.com>,
Palmer Dabbelt <palmer@rivosinc.com>,
stable@vger.kernel.org
Subject: [PATCH 2/2] riscv: asid: Fixup stale TLB entry cause application crash
Date: Sun, 26 Feb 2023 18:01:37 +0300 [thread overview]
Message-ID: <20230226150137.1919750-3-geomatsi@gmail.com> (raw)
In-Reply-To: <20230226150137.1919750-1-geomatsi@gmail.com>
From: Guo Ren <guoren@linux.alibaba.com>
After use_asid_allocator is enabled, the userspace application will
crash by stale TLB entries. Because only using cpumask_clear_cpu without
local_flush_tlb_all couldn't guarantee CPU's TLB entries were fresh.
Then set_mm_asid would cause the user space application to get a stale
value by stale TLB entry, but set_mm_noasid is okay.
Here is the symptom of the bug:
unhandled signal 11 code 0x1 (coredump)
0x0000003fd6d22524 <+4>: auipc s0,0x70
0x0000003fd6d22528 <+8>: ld s0,-148(s0) # 0x3fd6d92490
=> 0x0000003fd6d2252c <+12>: ld a5,0(s0)
(gdb) i r s0
s0 0x8082ed1cc3198b21 0x8082ed1cc3198b21
(gdb) x /2x 0x3fd6d92490
0x3fd6d92490: 0xd80ac8a8 0x0000003f
The core dump file shows that register s0 is wrong, but the value in
memory is correct. Because 'ld s0, -148(s0)' used a stale mapping entry
in TLB and got a wrong result from an incorrect physical address.
When the task ran on CPU0, which loaded/speculative-loaded the value of
address(0x3fd6d92490), then the first version of the mapping entry was
PTWed into CPU0's TLB.
When the task switched from CPU0 to CPU1 (No local_tlb_flush_all here by
asid), it happened to write a value on the address (0x3fd6d92490). It
caused do_page_fault -> wp_page_copy -> ptep_clear_flush ->
ptep_get_and_clear & flush_tlb_page.
The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need TLB
flush, but CPU0 had cleared the CPU0's mm_cpumask in the previous
switch_mm. So we only flushed the CPU1 TLB and set the second version
mapping of the PTE. When the task switched from CPU1 to CPU0 again, CPU0
still used a stale TLB mapping entry which contained a wrong target
physical address. It raised a bug when the task happened to read that
value.
CPU0 CPU1
- switch 'task' in
- read addr (Fill stale mapping
entry into TLB)
- switch 'task' out (no tlb_flush)
- switch 'task' in (no tlb_flush)
- write addr cause pagefault
do_page_fault() (change to
new addr mapping)
wp_page_copy()
ptep_clear_flush()
ptep_get_and_clear()
& flush_tlb_page()
write new value into addr
- switch 'task' out (no tlb_flush)
- switch 'task' in (no tlb_flush)
- read addr again (Use stale
mapping entry in TLB)
get wrong value from old phyical
addr, BUG!
The solution is to keep all CPUs' footmarks of cpumask(mm) in switch_mm,
which could guarantee to invalidate all stale TLB entries during TLB
flush.
Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Zong Li <zong.li@sifive.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Cc: Anup Patel <apatel@ventanamicro.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: stable@vger.kernel.org
---
arch/riscv/mm/context.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 7acbfbd14557..0f784e3d307b 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -205,12 +205,24 @@ static void set_mm_noasid(struct mm_struct *mm)
local_flush_tlb_all();
}
-static inline void set_mm(struct mm_struct *mm, unsigned int cpu)
+static inline void set_mm(struct mm_struct *prev,
+ struct mm_struct *next, unsigned int cpu)
{
- if (static_branch_unlikely(&use_asid_allocator))
- set_mm_asid(mm, cpu);
- else
- set_mm_noasid(mm);
+ /*
+ * The mm_cpumask indicates which harts' TLBs contain the virtual
+ * address mapping of the mm. Compared to noasid, using asid
+ * can't guarantee that stale TLB entries are invalidated because
+ * the asid mechanism wouldn't flush TLB for every switch_mm for
+ * performance. So when using asid, keep all CPUs footmarks in
+ * cpumask() until mm reset.
+ */
+ cpumask_set_cpu(cpu, mm_cpumask(next));
+ if (static_branch_unlikely(&use_asid_allocator)) {
+ set_mm_asid(next, cpu);
+ } else {
+ cpumask_clear_cpu(cpu, mm_cpumask(prev));
+ set_mm_noasid(next);
+ }
}
static int __init asids_init(void)
@@ -264,7 +276,8 @@ static int __init asids_init(void)
}
early_initcall(asids_init);
#else
-static inline void set_mm(struct mm_struct *mm, unsigned int cpu)
+static inline void set_mm(struct mm_struct *prev,
+ struct mm_struct *next, unsigned int cpu)
{
/* Nothing to do here when there is no MMU */
}
@@ -317,10 +330,7 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
*/
cpu = smp_processor_id();
- cpumask_clear_cpu(cpu, mm_cpumask(prev));
- cpumask_set_cpu(cpu, mm_cpumask(next));
-
- set_mm(next, cpu);
+ set_mm(prev, next, cpu);
flush_icache_deferred(next, cpu);
}
--
2.39.2
next prev parent reply other threads:[~2023-02-26 15:24 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-26 15:01 [PATCH 0/2] riscv: asid: switch to alternative way to fix stale TLB entries Sergey Matyukevich
2023-02-26 15:01 ` [PATCH 1/2] Revert "riscv: mm: notify remote harts about mmu cache updates" Sergey Matyukevich
2023-02-28 3:15 ` Guo Ren
2023-02-26 15:01 ` Sergey Matyukevich [this message]
2023-02-26 16:44 ` [PATCH 2/2] riscv: asid: Fixup stale TLB entry cause application crash Andrew Jones
2023-03-10 2:30 ` [PATCH 0/2] riscv: asid: switch to alternative way to fix stale TLB entries patchwork-bot+linux-riscv
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230226150137.1919750-3-geomatsi@gmail.com \
--to=geomatsi@gmail.com \
--cc=aou@eecs.berkeley.edu \
--cc=apatel@ventanamicro.com \
--cc=guoren@kernel.org \
--cc=guoren@linux.alibaba.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=palmer@dabbelt.com \
--cc=palmer@rivosinc.com \
--cc=paul.walmsley@sifive.com \
--cc=prabhakar.mahadev-lad.rj@bp.renesas.com \
--cc=sergey.matyukevich@syntacore.com \
--cc=stable@vger.kernel.org \
--cc=zong.li@sifive.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox