* [PATCH] target/riscv: Exit current TB after an sfence.vma
@ 2022-03-15 19:23 Idan Horowitz
2022-03-15 19:37 ` Richard Henderson
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Idan Horowitz @ 2022-03-15 19:23 UTC (permalink / raw)
To: qemu-riscv
Cc: Alistair Francis, Bin Meng, Palmer Dabbelt, qemu-devel,
Idan Horowitz
If the pages which control the translation of the currently executing
instructions are changed, and then the TLB is flushed using sfence.vma
we have to exit the current TB early, to ensure we don't execute stale
instructions.
Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com>
---
target/riscv/insn_trans/trans_privileged.c.inc | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/target/riscv/insn_trans/trans_privileged.c.inc b/target/riscv/insn_trans/trans_privileged.c.inc
index 53613682e8..f265e8202d 100644
--- a/target/riscv/insn_trans/trans_privileged.c.inc
+++ b/target/riscv/insn_trans/trans_privileged.c.inc
@@ -114,6 +114,13 @@ static bool trans_sfence_vma(DisasContext *ctx, arg_sfence_vma *a)
{
#ifndef CONFIG_USER_ONLY
gen_helper_tlb_flush(cpu_env);
+ /*
+ * The flush might have changed the backing physical memory of
+ * the instructions we're currently executing
+ */
+ gen_set_pc_imm(ctx, ctx->pc_succ_insn);
+ tcg_gen_exit_tb(NULL, 0);
+ ctx->base.is_jmp = DISAS_NORETURN;
return true;
#endif
return false;
--
2.35.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] target/riscv: Exit current TB after an sfence.vma 2022-03-15 19:23 [PATCH] target/riscv: Exit current TB after an sfence.vma Idan Horowitz @ 2022-03-15 19:37 ` Richard Henderson 2022-03-15 22:52 ` Alistair Francis 2022-03-15 23:42 ` Alistair Francis 2 siblings, 0 replies; 9+ messages in thread From: Richard Henderson @ 2022-03-15 19:37 UTC (permalink / raw) To: Idan Horowitz, qemu-riscv Cc: Palmer Dabbelt, Bin Meng, Alistair Francis, qemu-devel On 3/15/22 12:23, Idan Horowitz wrote: > If the pages which control the translation of the currently executing > instructions are changed, and then the TLB is flushed using sfence.vma > we have to exit the current TB early, to ensure we don't execute stale > instructions. > > Signed-off-by: Idan Horowitz<idan.horowitz@gmail.com> > --- > target/riscv/insn_trans/trans_privileged.c.inc | 7 +++++++ > 1 file changed, 7 insertions(+) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] target/riscv: Exit current TB after an sfence.vma 2022-03-15 19:23 [PATCH] target/riscv: Exit current TB after an sfence.vma Idan Horowitz 2022-03-15 19:37 ` Richard Henderson @ 2022-03-15 22:52 ` Alistair Francis 2022-03-15 23:42 ` Alistair Francis 2 siblings, 0 replies; 9+ messages in thread From: Alistair Francis @ 2022-03-15 22:52 UTC (permalink / raw) To: Idan Horowitz Cc: Alistair Francis, Bin Meng, open list:RISC-V, Palmer Dabbelt, qemu-devel@nongnu.org Developers On Wed, Mar 16, 2022 at 5:26 AM Idan Horowitz <idan.horowitz@gmail.com> wrote: > > If the pages which control the translation of the currently executing > instructions are changed, and then the TLB is flushed using sfence.vma > we have to exit the current TB early, to ensure we don't execute stale > instructions. > > Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Alistair > --- > target/riscv/insn_trans/trans_privileged.c.inc | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/target/riscv/insn_trans/trans_privileged.c.inc b/target/riscv/insn_trans/trans_privileged.c.inc > index 53613682e8..f265e8202d 100644 > --- a/target/riscv/insn_trans/trans_privileged.c.inc > +++ b/target/riscv/insn_trans/trans_privileged.c.inc > @@ -114,6 +114,13 @@ static bool trans_sfence_vma(DisasContext *ctx, arg_sfence_vma *a) > { > #ifndef CONFIG_USER_ONLY > gen_helper_tlb_flush(cpu_env); > + /* > + * The flush might have changed the backing physical memory of > + * the instructions we're currently executing > + */ > + gen_set_pc_imm(ctx, ctx->pc_succ_insn); > + tcg_gen_exit_tb(NULL, 0); > + ctx->base.is_jmp = DISAS_NORETURN; > return true; > #endif > return false; > -- > 2.35.1 > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] target/riscv: Exit current TB after an sfence.vma 2022-03-15 19:23 [PATCH] target/riscv: Exit current TB after an sfence.vma Idan Horowitz 2022-03-15 19:37 ` Richard Henderson 2022-03-15 22:52 ` Alistair Francis @ 2022-03-15 23:42 ` Alistair Francis 2022-03-30 6:09 ` Alistair Francis 2 siblings, 1 reply; 9+ messages in thread From: Alistair Francis @ 2022-03-15 23:42 UTC (permalink / raw) To: Idan Horowitz Cc: Alistair Francis, Bin Meng, open list:RISC-V, Palmer Dabbelt, qemu-devel@nongnu.org Developers On Wed, Mar 16, 2022 at 5:26 AM Idan Horowitz <idan.horowitz@gmail.com> wrote: > > If the pages which control the translation of the currently executing > instructions are changed, and then the TLB is flushed using sfence.vma > we have to exit the current TB early, to ensure we don't execute stale > instructions. > > Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com> Thanks! Applied to riscv-to-apply.next Alistair > --- > target/riscv/insn_trans/trans_privileged.c.inc | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/target/riscv/insn_trans/trans_privileged.c.inc b/target/riscv/insn_trans/trans_privileged.c.inc > index 53613682e8..f265e8202d 100644 > --- a/target/riscv/insn_trans/trans_privileged.c.inc > +++ b/target/riscv/insn_trans/trans_privileged.c.inc > @@ -114,6 +114,13 @@ static bool trans_sfence_vma(DisasContext *ctx, arg_sfence_vma *a) > { > #ifndef CONFIG_USER_ONLY > gen_helper_tlb_flush(cpu_env); > + /* > + * The flush might have changed the backing physical memory of > + * the instructions we're currently executing > + */ > + gen_set_pc_imm(ctx, ctx->pc_succ_insn); > + tcg_gen_exit_tb(NULL, 0); > + ctx->base.is_jmp = DISAS_NORETURN; > return true; > #endif > return false; > -- > 2.35.1 > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] target/riscv: Exit current TB after an sfence.vma 2022-03-15 23:42 ` Alistair Francis @ 2022-03-30 6:09 ` Alistair Francis 0 siblings, 0 replies; 9+ messages in thread From: Alistair Francis @ 2022-03-30 6:09 UTC (permalink / raw) To: Idan Horowitz Cc: Alistair Francis, Bin Meng, open list:RISC-V, Palmer Dabbelt, qemu-devel@nongnu.org Developers On Wed, Mar 16, 2022 at 9:42 AM Alistair Francis <alistair23@gmail.com> wrote: > > On Wed, Mar 16, 2022 at 5:26 AM Idan Horowitz <idan.horowitz@gmail.com> wrote: > > > > If the pages which control the translation of the currently executing > > instructions are changed, and then the TLB is flushed using sfence.vma > > we have to exit the current TB early, to ensure we don't execute stale > > instructions. > > > > Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com> > > Thanks! > > Applied to riscv-to-apply.next This results in failed Linux boots, I have dropped this patch Alistair ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <7f383fc2.81a2.17f93c0dad7.Coremail.phantom@zju.edu.cn>]
* Re: [PATCH] target/riscv: Exit current TB after an sfence.vma [not found] <7f383fc2.81a2.17f93c0dad7.Coremail.phantom@zju.edu.cn> @ 2022-03-29 23:15 ` Atish Patra 2022-03-30 6:15 ` Idan Horowitz 0 siblings, 1 reply; 9+ messages in thread From: Atish Patra @ 2022-03-29 23:15 UTC (permalink / raw) To: phantom Cc: open list:RISC-V, Alistair Francis, idan.horowitz, qemu-devel@nongnu.org Developers On Wed, Mar 16, 2022 at 10:23 AM <phantom@zju.edu.cn> wrote: > > Here is a test case for this patch. I used to submit this bug on https://bugs.launchpad.net/qemu/+bug/1906516 > > sfence.vma will flush the tlb, so after this instruction, the translation block should be end. > The following code will only work in single step mode: > ``` > relocate: > li a0, OFFSET > > la t0, 1f > add t0, t0, a0 > csrw stvec, t0 > > la t0, early_pgtbl > srl t0, t0, PAGE_SHIFT > li t1, SATP_SV39 > or t0, t1, t0 > > csrw satp, t0 > 1: > sfence.vma > la t0, trap_s > csrw stvec, t0 > ret > ``` > > In this code, I want to relocate pc to virtual address with the OFFSET prefix. > Before writing to satp, pc run at physic address, stvec has been set to label 1 > with a virtual prefix and virtual address has been mapping in early_pgtbl, > after writing satp, qemu will throw a page fault, and pc will set to virtual > address of label 1. > > The problem is that, in this situation, the translation block will not end after > sfence.vma, and stvec will be set to trap_s, > > ``` > ---------------- > IN: > Priv: 1; Virt: 0 > 0x00000000800000dc: 00a080b3 add ra,ra,a0 > 0x00000000800000e0: 00007297 auipc t0,28672 # 0x800070e0 > 0x00000000800000e4: f2028293 addi t0,t0,-224 > 0x00000000800000e8: 00c2d293 srli t0,t0,12 > 0x00000000800000ec: fff0031b addiw t1,zero,-1 > 0x00000000800000f0: 03f31313 slli t1,t1,63 > 0x00000000800000f4: 005362b3 or t0,t1,t0 > 0x00000000800000f8: 18029073 csrrw zero,satp,t0 > > ---------------- > IN: > Priv: 1; Virt: 0 > 0x00000000800000fc: 12000073 sfence.vma zero,zero > 0x0000000080000100: 00000297 auipc t0,0 # 0x80000100 > 0x0000000080000104: 1c828293 addi t0,t0,456 > 0x0000000080000108: 10529073 csrrw zero,stvec,t0 > > riscv_raise_exception: 12 > riscv_raise_exception: 12 > riscv_raise_exception: 12 > riscv_raise_exception: 12 > ... > ``` > > So, the program will crash. And the program will only work in single step mode: > ``` > ---------------- > IN: > Priv: 1; Virt: 0 > 0x00000000800000f8: 18029073 csrrw zero,satp,t0 > > ---------------- > IN: > Priv: 1; Virt: 0 > 0x00000000800000fc: 12000073 sfence.vma zero,zero > > riscv_raise_exception: 12 > ---------------- > IN: > Priv: 1; Virt: 0 > 0xffffffff800000fc: 12000073 sfence.vma zero,zero > > ---------------- > IN: > Priv: 1; Virt: 0 > 0xffffffff80000100: 00000297 auipc t0,0 # 0xffffffff80000100 > > ``` > The pc will set to label 1, instead of trap_s. +qemu-dev and Alistair This is in for-next on Alistair's tree and fails to boot the kernel with the following error (found -d in_asm mode). Reverting the patch solves the issue. ---------------- IN: Priv: 1; Virt: 0 0x0000000080201040: 18051073 csrrw zero,satp,a0 ---------------- IN: Priv: 1; Virt: 0 0x0000000080201044: Address 0x80201044 is out of bounds. 0x0000000080201049: Address 0x80201049 is out of bounds. 0x000000008020104e: Address 0x8020104e is out of bounds. Disassembler disagrees with translator over instruction decoding Please report this to qemu-devel@nongnu.org ---------------- IN: Priv: 1; Virt: 0 0x0000000080201050: Address 0x80201050 is out of bounds. 0x0000000080201055: Address 0x80201055 is out of bounds. 0x000000008020105a: Address 0x8020105a is out of bounds. Disassembler disagrees with translator over instruction decoding Please report this to qemu-devel@nongnu.org ---------------- IN: Priv: 1; Virt: 0 0x000000008020105c: Address 0x8020105c is out of bounds. Disassembler disagrees with translator over instruction decoding Please report this to qemu-devel@nongnu.org -- Regards, Atish ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] target/riscv: Exit current TB after an sfence.vma 2022-03-29 23:15 ` Atish Patra @ 2022-03-30 6:15 ` Idan Horowitz 2022-03-30 7:28 ` Atish Patra 0 siblings, 1 reply; 9+ messages in thread From: Idan Horowitz @ 2022-03-30 6:15 UTC (permalink / raw) To: Atish Patra Cc: Alistair Francis, phantom, open list:RISC-V, qemu-devel@nongnu.org Developers On Wed, 30 Mar 2022 at 02:16, Atish Patra <atishp@atishpatra.org> wrote: > > This is in for-next on Alistair's tree and fails to boot the kernel > with the following error (found -d in_asm mode). > Reverting the patch solves the issue. > > ---------------- > IN: > Priv: 1; Virt: 0 > 0x0000000080201040: 18051073 csrrw zero,satp,a0 > > ---------------- > IN: > Priv: 1; Virt: 0 > 0x0000000080201044: Address 0x80201044 is out of bounds. > > 0x0000000080201049: Address 0x80201049 is out of bounds. > > 0x000000008020104e: Address 0x8020104e is out of bounds. > > Disassembler disagrees with translator over instruction decoding > Please report this to qemu-devel@nongnu.org > > ---------------- > IN: > Priv: 1; Virt: 0 > 0x0000000080201050: Address 0x80201050 is out of bounds. > > 0x0000000080201055: Address 0x80201055 is out of bounds. > > 0x000000008020105a: Address 0x8020105a is out of bounds. > > Disassembler disagrees with translator over instruction decoding > Please report this to qemu-devel@nongnu.org > > ---------------- > IN: > Priv: 1; Virt: 0 > 0x000000008020105c: Address 0x8020105c is out of bounds. > > Disassembler disagrees with translator over instruction decoding > Please report this to qemu-devel@nongnu.org > > -- > Regards, > Atish Do you have more specific information about which kernel image doesn't boot? The errors you're seeing simply mean that these addresses are not translated by the new address translation context set by the write to the satp. To be honest I don't immediately see how this could be caused by the patch, as it modifies the behaviour of the sfence.vma instruction, and there are none in your trace. Idan Horowitz ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] target/riscv: Exit current TB after an sfence.vma 2022-03-30 6:15 ` Idan Horowitz @ 2022-03-30 7:28 ` Atish Patra 2022-03-30 7:35 ` Idan Horowitz 0 siblings, 1 reply; 9+ messages in thread From: Atish Patra @ 2022-03-30 7:28 UTC (permalink / raw) To: Idan Horowitz Cc: Alistair Francis, phantom, open list:RISC-V, qemu-devel@nongnu.org Developers On Tue, Mar 29, 2022 at 11:15 PM Idan Horowitz <idan.horowitz@gmail.com> wrote: > > On Wed, 30 Mar 2022 at 02:16, Atish Patra <atishp@atishpatra.org> wrote: > > > > This is in for-next on Alistair's tree and fails to boot the kernel > > with the following error (found -d in_asm mode). > > Reverting the patch solves the issue. > > > > ---------------- > > IN: > > Priv: 1; Virt: 0 > > 0x0000000080201040: 18051073 csrrw zero,satp,a0 > > > > ---------------- > > IN: > > Priv: 1; Virt: 0 > > 0x0000000080201044: Address 0x80201044 is out of bounds. > > > > 0x0000000080201049: Address 0x80201049 is out of bounds. > > > > 0x000000008020104e: Address 0x8020104e is out of bounds. > > > > Disassembler disagrees with translator over instruction decoding > > Please report this to qemu-devel@nongnu.org > > > > ---------------- > > IN: > > Priv: 1; Virt: 0 > > 0x0000000080201050: Address 0x80201050 is out of bounds. > > > > 0x0000000080201055: Address 0x80201055 is out of bounds. > > > > 0x000000008020105a: Address 0x8020105a is out of bounds. > > > > Disassembler disagrees with translator over instruction decoding > > Please report this to qemu-devel@nongnu.org > > > > ---------------- > > IN: > > Priv: 1; Virt: 0 > > 0x000000008020105c: Address 0x8020105c is out of bounds. > > > > Disassembler disagrees with translator over instruction decoding > > Please report this to qemu-devel@nongnu.org > > > > -- > > Regards, > > Atish > > Do you have more specific information about which kernel image doesn't boot? I tested on v5.17 built from defconfig for rv64. > The errors you're seeing simply mean that these addresses are not > translated by the new address translation context set by the write to > the satp. > To be honest I don't immediately see how this could be caused by the > patch, as it modifies the behaviour of the sfence.vma instruction, and > there are none in your trace. > There was a sfence.vma. I just did not share the detailed trace before. Here is the kernel code executing sfence.vma https://elixir.bootlin.com/linux/v5.17/source/arch/riscv/kernel/head.S#L122 Here is the detailed trace that should provide more information. ------------------------------------------------------------------------------------------------------------------------------ ---------------- IN: Priv: 1; Virt: 0 0x0000000080a04664: 70e2 ld ra,56(sp) 0x0000000080a04666: 7442 ld s0,48(sp) 0x0000000080a04668: 74a2 ld s1,40(sp) 0x0000000080a0466a: 7902 ld s2,32(sp) 0x0000000080a0466c: 69e2 ld s3,24(sp) 0x0000000080a0466e: 6a42 ld s4,16(sp) 0x0000000080a04670: 6aa2 ld s5,8(sp) 0x0000000080a04672: 6121 addi sp,sp,64 0x0000000080a04674: 8082 ret ---------------- IN: Priv: 1; Virt: 0 0x0000000080201132: 00a05517 auipc a0,10506240 # 0x80c06132 0x0000000080201136: ece50513 addi a0,a0,-306 0x000000008020113a: ec7ff0ef jal ra,-314 # 0x80201000 ---------------- IN: Priv: 1; Virt: 0 0x0000000080201000: 00d95597 auipc a1,14241792 # 0x80f96000 0x0000000080201004: 38858593 addi a1,a1,904 0x0000000080201008: 658c ld a1,8(a1) 0x000000008020100a: fffff617 auipc a2,-4096 # 0x8020000a 0x000000008020100e: ff660613 addi a2,a2,-10 0x0000000080201012: 8d91 sub a1,a1,a2 0x0000000080201014: 90ae add ra,ra,a1 0x0000000080201016: 00000617 auipc a2,0 # 0x80201016 0x000000008020101a: 02e60613 addi a2,a2,46 0x000000008020101e: 962e add a2,a2,a1 0x0000000080201020: 10561073 csrrw zero,stvec,a2 ---------------- IN: Priv: 1; Virt: 0 0x0000000080201024: 00c55613 srli a2,a0,12 0x0000000080201028: 83018593 addi a1,gp,-2000 0x000000008020102c: 618c ld a1,0(a1) 0x000000008020102e: 8e4d or a2,a2,a1 0x0000000080201030: 010f7517 auipc a0,17788928 # 0x812f8030 0x0000000080201034: fd050513 addi a0,a0,-48 0x0000000080201038: 8131 srli a0,a0,12 0x000000008020103a: 8d4d or a0,a0,a1 0x000000008020103c: 12000073 sfence.vma zero,zero ---------------- IN: Priv: 1; Virt: 0 0x0000000080201040: 18051073 csrrw zero,satp,a0 ---------------- IN: Priv: 1; Virt: 0 0x0000000080201044: Address 0x80201044 is out of bounds. 0x0000000080201049: Address 0x80201049 is out of bounds. 0x000000008020104e: Address 0x8020104e is out of bounds. Disassembler disagrees with translator over instruction decoding Please report this to qemu-devel@nongnu.org ---------------- IN: Priv: 1; Virt: 0 0x0000000080201050: Address 0x80201050 is out of bounds. 0x0000000080201055: Address 0x80201055 is out of bounds. 0x000000008020105a: Address 0x8020105a is out of bounds. Disassembler disagrees with translator over instruction decoding Please report this to qemu-devel@nongnu.org ---------------- IN: Priv: 1; Virt: 0 0x000000008020105c: Address 0x8020105c is out of bounds. Disassembler disagrees with translator over instruction decoding Please report this to qemu-devel@nongnu.org ------------------------------------------------------------------------------------------------------------------------------ > Idan Horowitz -- Regards, Atish ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] target/riscv: Exit current TB after an sfence.vma 2022-03-30 7:28 ` Atish Patra @ 2022-03-30 7:35 ` Idan Horowitz 0 siblings, 0 replies; 9+ messages in thread From: Idan Horowitz @ 2022-03-30 7:35 UTC (permalink / raw) To: Atish Patra Cc: Alistair Francis, phantom, open list:RISC-V, qemu-devel@nongnu.org Developers On Wed, 30 Mar 2022 at 10:28, Atish Patra <atishp@atishpatra.org> wrote: > > I tested on v5.17 built from defconfig for rv64. > > Here is the kernel code executing sfence.vma > https://elixir.bootlin.com/linux/v5.17/source/arch/riscv/kernel/head.S#L122 > I believe this is a kernel bug and not a QEMU one. They perform a write to the SATP with the same ASID as the one used before (0) and then expect it to be used, without performing an sfence.vma following it. This was exposed by my change, as previously the write to the satp was performed in the same TB block as the sfence.vma *before it*, which meant the TLB was not filled in between the previous sfence and the write to SATP following it. I was able to reproduce the issue with the Fedora Rawhide image in the wiki, and I was able to resolve it by artificially forcing a TLB flush following all writes to SATP. I think the correct course of action here is to: 1. Report the issue to the linux kernel mailing list and/or contribute a patch that adds said missing sfence.vma following the SATP write. (Atish: Are you able to test if adding an sfence.vma in your kernel build fixes the issue?) 2. Restore the patch > > -- > Regards, > Atish Idan Horowitz ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-03-30 7:44 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-15 19:23 [PATCH] target/riscv: Exit current TB after an sfence.vma Idan Horowitz
2022-03-15 19:37 ` Richard Henderson
2022-03-15 22:52 ` Alistair Francis
2022-03-15 23:42 ` Alistair Francis
2022-03-30 6:09 ` Alistair Francis
[not found] <7f383fc2.81a2.17f93c0dad7.Coremail.phantom@zju.edu.cn>
2022-03-29 23:15 ` Atish Patra
2022-03-30 6:15 ` Idan Horowitz
2022-03-30 7:28 ` Atish Patra
2022-03-30 7:35 ` Idan Horowitz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).