From: Conor Dooley <conor.dooley@microchip.com>
To: <guoren@kernel.org>
Cc: <linux-riscv@lists.infradead.org>, <guoren@kernel.org>,
<anup@brainfault.org>, <paul.walmsley@sifive.com>,
<palmer@dabbelt.com>, <heiko@sntech.de>,
<philipp.tomsich@vrull.eu>, <linux-kernel@vger.kernel.org>,
Guo Ren <guoren@linux.alibaba.com>,
Anup Patel <apatel@ventanamicro.com>,
Palmer Dabbelt <palmer@rivosinc.com>
Subject: Re: [PATCH RESEND] riscv: asid: Fixup stale TLB entry cause application crash
Date: Tue, 8 Nov 2022 14:22:45 +0000
Message-ID: <Y2pmNUBocENfS4uK@wendy>
In-Reply-To: <D91557B5-7E60-4A29-8669-34FF42454F8C@kernel.org>
On Tue, Nov 08, 2022 at 10:27:51AM +0000, Conor Dooley wrote:
>
>
> On 8 November 2022 10:20:44 GMT, guoren@kernel.org wrote:
> >From: Guo Ren <guoren@linux.alibaba.com>
> >
> >With use_asid_allocator enabled, userspace applications can crash
> >because of stale TLB entries: calling cpumask_clear_cpu() without
> >local_flush_tlb_all() does not guarantee that the CPU's TLB entries
> >are fresh. As a result, set_mm_asid() lets a userspace application
> >read a stale value through a stale TLB entry; set_mm_noasid() is fine.
> >
> >Here is the symptom of the bug:
> >unhandled signal 11 code 0x1 (coredump)
> > 0x0000003fd6d22524 <+4>: auipc s0,0x70
> > 0x0000003fd6d22528 <+8>: ld s0,-148(s0) # 0x3fd6d92490
> >=> 0x0000003fd6d2252c <+12>: ld a5,0(s0)
> >(gdb) i r s0
> >s0 0x8082ed1cc3198b21 0x8082ed1cc3198b21
> >(gdb) x/16 0x3fd6d92490
> >0x3fd6d92490: 0xd80ac8a8 0x0000003f
> >The core dump shows that the value in register s0 is wrong, while the
> >value in memory is right. This is because 'ld s0, -148(s0)' used a stale
> >TLB entry and loaded a wrong value from a stale physical address.
> >
> >When the task ran on CPU0, it loaded (or speculatively loaded) the value
> >at address 0x3fd6d92490, and a page table walk filled the first version
> >of the mapping into CPU0's TLB.
> >When the task switched from CPU0 to CPU1 without local_flush_tlb_all()
> >(because of ASID), it happened to write a value to address
> >0x3fd6d92490. This caused do_page_fault -> wp_page_copy ->
> >ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
> >flush_tlb_page() uses mm_cpumask(mm) to determine which CPUs need a
> >TLB flush, but CPU0 had already cleared its bit in mm_cpumask during
> >the previous switch_mm(). So only CPU1's TLB was flushed before the
> >second version of the PTE mapping was set. When the task switched from
> >CPU1 back to CPU0, CPU0 still used the stale TLB entry, which contained
> >the wrong physical address.
> >When the task then read that value, the bug was triggered.
> >
> >Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
> >Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> >Signed-off-by: Guo Ren <guoren@kernel.org>
> >Cc: Anup Patel <apatel@ventanamicro.com>
> >Cc: Palmer Dabbelt <palmer@rivosinc.com>
> >---
> > arch/riscv/mm/context.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> >diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
> >index 7acbfbd14557..8ad6c2493e93 100644
> >--- a/arch/riscv/mm/context.c
> >+++ b/arch/riscv/mm/context.c
> >@@ -317,7 +317,9 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> > 	 */
> > 	cpu = smp_processor_id();
> >
> >-	cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >+	if (!static_branch_unlikely(&use_asid_allocator))
> >+		cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >+
> > 	cpumask_set_cpu(cpu, mm_cpumask(next));
> >
> > 	set_mm(next, cpu);
>
> This is a completely different patch to what you already sent.
> Why have you marked it RESEND rather than v2?
In addition, it seems to break the build for the nommu defconfigs.