* [PATCH 0/6] accel/tcg: Always require can_do_io (#1866)
@ 2023-09-14 17:44 Richard Henderson
2023-09-14 17:44 ` [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused Richard Henderson
` (5 more replies)
0 siblings, 6 replies; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
The problem exposed by the fix for #1826 (et al) is that the TB that
contains the i/o instruction that alters the address space continues
on to issue other i/o instructions.
Since #1826 deferred the update to the address space, these subsequent
i/o instructions do not reference the correct address space, and things
go awry from there.
Ideally we would treat changes to the address space as specially, but
that knowledge is buried quite far down in the device models. We don't
find out that such a change is coming until quite late such that we
cannot undo all of the other side effects and start over.
The only alternative would seem to be to treat all i/o pesimistically
and end the TB after any i/o, exactly as we do for icount.
I'm not pleased about this, because an eyeball of avocado times
suggests a slowdown. No doubt caused by most i/o having to go
through cpu_io_recompile.
I begin to wonder if #1826 should be solved differently, like *not*
caching MemoryRegionSections within the cpu tlb, and looking up the
physical address within the address space during the i/o itself.
But even that seems like it would work only for the more common case
where the address space reorg only changes devices. For the odd case
where an address space reorg changes RAM, we still have the result
of the phys addr lookup cached via the adjustment to host memory.
So this seems like the most reasonable solution.
Follow-up patches could optimize setting of can_do_io, and replace
previous TranslationBlocks so that we only go through cpu_io_recompile
once for a particular bit of guest code, rather than every single time.
r~
Richard Henderson (6):
accel/tcg: Avoid load of icount_decr if unused
accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
accel/tcg: Track current value of can_do_io in the TB
accel/tcg: Improve setting of can_do_io at start of TB
accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
accel/tcg: Always require can_do_io
include/exec/translator.h | 2 ++
accel/tcg/cpu-exec.c | 2 +-
accel/tcg/tb-maint.c | 6 ++--
accel/tcg/translator.c | 72 ++++++++++++++++++-------------------
target/mips/tcg/translate.c | 1 -
5 files changed, 41 insertions(+), 42 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-15 9:23 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop Richard Henderson
` (4 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
With CF_NOIRQ and without !CF_USE_ICOUNT, the load isn't used.
Avoid emitting it.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 1a6a5448c8..a3983019a5 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -49,12 +49,15 @@ bool translator_io_start(DisasContextBase *db)
static TCGOp *gen_tb_start(uint32_t cflags)
{
- TCGv_i32 count = tcg_temp_new_i32();
+ TCGv_i32 count = NULL;
TCGOp *icount_start_insn = NULL;
- tcg_gen_ld_i32(count, cpu_env,
- offsetof(ArchCPU, neg.icount_decr.u32) -
- offsetof(ArchCPU, env));
+ if ((cflags & CF_USE_ICOUNT) || !(cflags & CF_NOIRQ)) {
+ count = tcg_temp_new_i32();
+ tcg_gen_ld_i32(count, cpu_env,
+ offsetof(ArchCPU, neg.icount_decr.u32) -
+ offsetof(ArchCPU, env));
+ }
if (cflags & CF_USE_ICOUNT) {
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
2023-09-14 17:44 ` [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-15 9:26 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB Richard Henderson
` (3 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
The condition checked is loop invariant; check it only once.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index a3983019a5..b6ab9f3d33 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -158,7 +158,13 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
- plugin_enabled = plugin_gen_tb_start(cpu, db, cflags & CF_MEMI_ONLY);
+ if (cflags & CF_MEMI_ONLY) {
+ /* We should only see CF_MEMI_ONLY for io_recompile. */
+ assert(cflags & CF_LAST_IO);
+ plugin_enabled = plugin_gen_tb_start(cpu, db, true);
+ } else {
+ plugin_enabled = plugin_gen_tb_start(cpu, db, false);
+ }
while (true) {
*max_insns = ++db->num_insns;
@@ -176,12 +182,8 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
gen_io_start();
- ops->translate_insn(db, cpu);
- } else {
- /* we should only see CF_MEMI_ONLY for io_recompile */
- tcg_debug_assert(!(cflags & CF_MEMI_ONLY));
- ops->translate_insn(db, cpu);
}
+ ops->translate_insn(db, cpu);
/*
* We can't instrument after instructions that change control
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
2023-09-14 17:44 ` [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused Richard Henderson
2023-09-14 17:44 ` [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-15 9:41 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB Richard Henderson
` (2 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
Simplify translator_io_start by recording the current
known value of can_do_io within DisasContextBase.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/translator.h | 2 ++
accel/tcg/translator.c | 31 ++++++++++++++-----------------
2 files changed, 16 insertions(+), 17 deletions(-)
diff --git a/include/exec/translator.h b/include/exec/translator.h
index 4e17c4f401..9d9e980819 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -72,6 +72,7 @@ typedef enum DisasJumpType {
* @num_insns: Number of translated instructions (including current).
* @max_insns: Maximum number of instructions to be translated in this TB.
* @singlestep_enabled: "Hardware" single stepping enabled.
+ * @saved_can_do_io: Known value of cpu->neg.can_do_io, or -1 for unknown.
*
* Architecture-agnostic disassembly context.
*/
@@ -83,6 +84,7 @@ typedef struct DisasContextBase {
int num_insns;
int max_insns;
bool singlestep_enabled;
+ int8_t saved_can_do_io;
void *host_addr[2];
} DisasContextBase;
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index b6ab9f3d33..850d82e26f 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -16,11 +16,14 @@
#include "tcg/tcg-op-common.h"
#include "internal.h"
-static void gen_io_start(void)
+static void set_can_do_io(DisasContextBase *db, bool val)
{
- tcg_gen_st_i32(tcg_constant_i32(1), cpu_env,
- offsetof(ArchCPU, parent_obj.can_do_io) -
- offsetof(ArchCPU, env));
+ if (db->saved_can_do_io != val) {
+ db->saved_can_do_io = val;
+ tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
+ offsetof(ArchCPU, parent_obj.can_do_io) -
+ offsetof(ArchCPU, env));
+ }
}
bool translator_io_start(DisasContextBase *db)
@@ -30,12 +33,8 @@ bool translator_io_start(DisasContextBase *db)
if (!(cflags & CF_USE_ICOUNT)) {
return false;
}
- if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
- /* Already started in translator_loop. */
- return true;
- }
- gen_io_start();
+ set_can_do_io(db, true);
/*
* Ensure that this instruction will be the last in the TB.
@@ -47,7 +46,7 @@ bool translator_io_start(DisasContextBase *db)
return true;
}
-static TCGOp *gen_tb_start(uint32_t cflags)
+static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
{
TCGv_i32 count = NULL;
TCGOp *icount_start_insn = NULL;
@@ -91,12 +90,9 @@ static TCGOp *gen_tb_start(uint32_t cflags)
* cpu->can_do_io is cleared automatically here at the beginning of
* each translation block. The cost is minimal and only paid for
* -icount, plus it would be very easy to forget doing it in the
- * translator. Doing it here means we don't need a gen_io_end() to
- * go with gen_io_start().
+ * translator.
*/
- tcg_gen_st_i32(tcg_constant_i32(0), cpu_env,
- offsetof(ArchCPU, parent_obj.can_do_io) -
- offsetof(ArchCPU, env));
+ set_can_do_io(db, false);
}
return icount_start_insn;
@@ -147,6 +143,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
db->num_insns = 0;
db->max_insns = *max_insns;
db->singlestep_enabled = cflags & CF_SINGLE_STEP;
+ db->saved_can_do_io = -1;
db->host_addr[0] = host_pc;
db->host_addr[1] = NULL;
@@ -154,7 +151,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
/* Start translating. */
- icount_start_insn = gen_tb_start(cflags);
+ icount_start_insn = gen_tb_start(db, cflags);
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
@@ -181,7 +178,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
the next instruction. */
if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
- gen_io_start();
+ set_can_do_io(db, true);
}
ops->translate_insn(db, cpu);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
` (2 preceding siblings ...)
2023-09-14 17:44 ` [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-15 9:47 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ Richard Henderson
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
Initialize can_do_io to true if this the TB has CF_LAST_IO
and will consist of a single instruction. This avoids a
set to 0 followed immediately by a set to 1.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 850d82e26f..dd507cd471 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -87,12 +87,12 @@ static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
offsetof(ArchCPU, neg.icount_decr.u16.low) -
offsetof(ArchCPU, env));
/*
- * cpu->can_do_io is cleared automatically here at the beginning of
+ * cpu->can_do_io is set automatically here at the beginning of
* each translation block. The cost is minimal and only paid for
* -icount, plus it would be very easy to forget doing it in the
* translator.
*/
- set_can_do_io(db, false);
+ set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
}
return icount_start_insn;
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
` (3 preceding siblings ...)
2023-09-14 17:44 ` [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-19 9:26 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
Without this we can get see loops through cpu_io_recompile,
in which the cpu makes no progress.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cpu-exec.c | 2 +-
accel/tcg/tb-maint.c | 6 ++++--
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index e2c494e75e..c724e8b6f1 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -720,7 +720,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
&& cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra == 0) {
/* Execute just one insn to trigger exception pending in the log */
cpu->cflags_next_tb = (curr_cflags(cpu) & ~CF_USE_ICOUNT)
- | CF_NOIRQ | 1;
+ | CF_LAST_IO | CF_NOIRQ | 1;
}
#endif
return false;
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 32ae8af61c..0c3e227409 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -1083,7 +1083,8 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
if (current_tb_modified) {
/* Force execution of one insn next time. */
CPUState *cpu = current_cpu;
- cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(current_cpu);
+ cpu->cflags_next_tb =
+ 1 | CF_LAST_IO | CF_NOIRQ | curr_cflags(current_cpu);
return true;
}
return false;
@@ -1153,7 +1154,8 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
if (current_tb_modified) {
page_collection_unlock(pages);
/* Force execution of one insn next time. */
- current_cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(current_cpu);
+ current_cpu->cflags_next_tb =
+ 1 | CF_LAST_IO | CF_NOIRQ | curr_cflags(current_cpu);
mmap_unlock();
cpu_loop_exit_noexc(current_cpu);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 6/6] accel/tcg: Always require can_do_io
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
` (4 preceding siblings ...)
2023-09-14 17:44 ` [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-19 9:37 ` Philippe Mathieu-Daudé
2023-10-24 9:50 ` Clément Chigot
5 siblings, 2 replies; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
Require i/o as the last insn of a TranslationBlock always,
not only with icount. This is required for i/o that alters
the address space, such as a pci config space write.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1866
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 20 +++++++-------------
target/mips/tcg/translate.c | 1 -
2 files changed, 7 insertions(+), 14 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index dd507cd471..358214d526 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -28,12 +28,6 @@ static void set_can_do_io(DisasContextBase *db, bool val)
bool translator_io_start(DisasContextBase *db)
{
- uint32_t cflags = tb_cflags(db->tb);
-
- if (!(cflags & CF_USE_ICOUNT)) {
- return false;
- }
-
set_can_do_io(db, true);
/*
@@ -86,15 +80,15 @@ static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
tcg_gen_st16_i32(count, cpu_env,
offsetof(ArchCPU, neg.icount_decr.u16.low) -
offsetof(ArchCPU, env));
- /*
- * cpu->can_do_io is set automatically here at the beginning of
- * each translation block. The cost is minimal and only paid for
- * -icount, plus it would be very easy to forget doing it in the
- * translator.
- */
- set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
}
+ /*
+ * cpu->can_do_io is set automatically here at the beginning of
+ * each translation block. The cost is minimal, plus it would be
+ * very easy to forget doing it in the translator.
+ */
+ set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
+
return icount_start_insn;
}
diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index 9bb40f1849..593fc80458 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -11212,7 +11212,6 @@ static void gen_branch(DisasContext *ctx, int insn_bytes)
/* Branches completion */
clear_branch_hflags(ctx);
ctx->base.is_jmp = DISAS_NORETURN;
- /* FIXME: Need to clear can_do_io. */
switch (proc_hflags & MIPS_HFLAG_BMASK_BASE) {
case MIPS_HFLAG_FBNSLOT:
gen_goto_tb(ctx, 0, ctx->base.pc_next + insn_bytes);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused
2023-09-14 17:44 ` [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused Richard Henderson
@ 2023-09-15 9:23 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:23 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 14/9/23 19:44, Richard Henderson wrote:
> With CF_NOIRQ and without !CF_USE_ICOUNT, the load isn't used.
> Avoid emitting it.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/translator.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
2023-09-14 17:44 ` [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop Richard Henderson
@ 2023-09-15 9:26 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:26 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 14/9/23 19:44, Richard Henderson wrote:
> The condition checked is loop invariant; check it only once.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/translator.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io()
2023-09-14 17:44 ` [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB Richard Henderson
@ 2023-09-15 9:41 ` Philippe Mathieu-Daudé
2023-09-15 9:41 ` [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB Philippe Mathieu-Daudé
2023-09-15 9:42 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
0 siblings, 2 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:41 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson, Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230914174436.1597356-4-richard.henderson@linaro.org>
[PMD: Split patch in 2, extracting set_can_do_io() first]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
accel/tcg/translator.c | 19 ++++++++-----------
1 file changed, 8 insertions(+), 11 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index b6ab9f3d33..cebf0a84cc 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -16,9 +16,9 @@
#include "tcg/tcg-op-common.h"
#include "internal.h"
-static void gen_io_start(void)
+static void set_can_do_io(DisasContextBase *db, bool val)
{
- tcg_gen_st_i32(tcg_constant_i32(1), cpu_env,
+ tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
offsetof(ArchCPU, parent_obj.can_do_io) -
offsetof(ArchCPU, env));
}
@@ -35,7 +35,7 @@ bool translator_io_start(DisasContextBase *db)
return true;
}
- gen_io_start();
+ set_can_do_io(db, true);
/*
* Ensure that this instruction will be the last in the TB.
@@ -47,7 +47,7 @@ bool translator_io_start(DisasContextBase *db)
return true;
}
-static TCGOp *gen_tb_start(uint32_t cflags)
+static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
{
TCGv_i32 count = NULL;
TCGOp *icount_start_insn = NULL;
@@ -91,12 +91,9 @@ static TCGOp *gen_tb_start(uint32_t cflags)
* cpu->can_do_io is cleared automatically here at the beginning of
* each translation block. The cost is minimal and only paid for
* -icount, plus it would be very easy to forget doing it in the
- * translator. Doing it here means we don't need a gen_io_end() to
- * go with gen_io_start().
+ * translator.
*/
- tcg_gen_st_i32(tcg_constant_i32(0), cpu_env,
- offsetof(ArchCPU, parent_obj.can_do_io) -
- offsetof(ArchCPU, env));
+ set_can_do_io(db, false);
}
return icount_start_insn;
@@ -154,7 +151,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
/* Start translating. */
- icount_start_insn = gen_tb_start(cflags);
+ icount_start_insn = gen_tb_start(db, cflags);
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
@@ -181,7 +178,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
the next instruction. */
if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
- gen_io_start();
+ set_can_do_io(db, true);
}
ops->translate_insn(db, cpu);
--
2.41.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB
2023-09-15 9:41 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
@ 2023-09-15 9:41 ` Philippe Mathieu-Daudé
2023-09-15 9:44 ` Philippe Mathieu-Daudé
2023-09-15 9:42 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
1 sibling, 1 reply; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:41 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson, Philippe Mathieu-Daudé
From: Richard Henderson <richard.henderson@linaro.org>
Simplify translator_io_start by recording the current
known value of can_do_io within DisasContextBase.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230914174436.1597356-4-richard.henderson@linaro.org>
[PMD: Split patch in 2, extracting set_can_do_io() first]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
include/exec/translator.h | 2 ++
accel/tcg/translator.c | 14 +++++++-------
2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/include/exec/translator.h b/include/exec/translator.h
index 4e17c4f401..9d9e980819 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -72,6 +72,7 @@ typedef enum DisasJumpType {
* @num_insns: Number of translated instructions (including current).
* @max_insns: Maximum number of instructions to be translated in this TB.
* @singlestep_enabled: "Hardware" single stepping enabled.
+ * @saved_can_do_io: Known value of cpu->neg.can_do_io, or -1 for unknown.
*
* Architecture-agnostic disassembly context.
*/
@@ -83,6 +84,7 @@ typedef struct DisasContextBase {
int num_insns;
int max_insns;
bool singlestep_enabled;
+ int8_t saved_can_do_io;
void *host_addr[2];
} DisasContextBase;
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index cebf0a84cc..850d82e26f 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -18,9 +18,12 @@
static void set_can_do_io(DisasContextBase *db, bool val)
{
- tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
- offsetof(ArchCPU, parent_obj.can_do_io) -
- offsetof(ArchCPU, env));
+ if (db->saved_can_do_io != val) {
+ db->saved_can_do_io = val;
+ tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
+ offsetof(ArchCPU, parent_obj.can_do_io) -
+ offsetof(ArchCPU, env));
+ }
}
bool translator_io_start(DisasContextBase *db)
@@ -30,10 +33,6 @@ bool translator_io_start(DisasContextBase *db)
if (!(cflags & CF_USE_ICOUNT)) {
return false;
}
- if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
- /* Already started in translator_loop. */
- return true;
- }
set_can_do_io(db, true);
@@ -144,6 +143,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
db->num_insns = 0;
db->max_insns = *max_insns;
db->singlestep_enabled = cflags & CF_SINGLE_STEP;
+ db->saved_can_do_io = -1;
db->host_addr[0] = host_pc;
db->host_addr[1] = NULL;
--
2.41.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io()
2023-09-15 9:41 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
2023-09-15 9:41 ` [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB Philippe Mathieu-Daudé
@ 2023-09-15 9:42 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:42 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson
On 15/9/23 11:41, Philippe Mathieu-Daudé wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> Message-ID: <20230914174436.1597356-4-richard.henderson@linaro.org>
> [PMD: Split patch in 2, extracting set_can_do_io() first]
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
> accel/tcg/translator.c | 19 ++++++++-----------
> 1 file changed, 8 insertions(+), 11 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB
2023-09-15 9:41 ` [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB Philippe Mathieu-Daudé
@ 2023-09-15 9:44 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:44 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson
On 15/9/23 11:41, Philippe Mathieu-Daudé wrote:
> From: Richard Henderson <richard.henderson@linaro.org>
>
> Simplify translator_io_start by recording the current
> known value of can_do_io within DisasContextBase.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> Message-ID: <20230914174436.1597356-4-richard.henderson@linaro.org>
> [PMD: Split patch in 2, extracting set_can_do_io() first]
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
> include/exec/translator.h | 2 ++
> accel/tcg/translator.c | 14 +++++++-------
> 2 files changed, 9 insertions(+), 7 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB
2023-09-14 17:44 ` [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB Richard Henderson
@ 2023-09-15 9:47 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:47 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 14/9/23 19:44, Richard Henderson wrote:
> Initialize can_do_io to true if this the TB has CF_LAST_IO
> and will consist of a single instruction. This avoids a
> set to 0 followed immediately by a set to 1.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/translator.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
2023-09-14 17:44 ` [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ Richard Henderson
@ 2023-09-19 9:26 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-19 9:26 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 14/9/23 19:44, Richard Henderson wrote:
> Without this we can get see loops through cpu_io_recompile,
> in which the cpu makes no progress.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cpu-exec.c | 2 +-
> accel/tcg/tb-maint.c | 6 ++++--
> 2 files changed, 5 insertions(+), 3 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] accel/tcg: Always require can_do_io
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
@ 2023-09-19 9:37 ` Philippe Mathieu-Daudé
2023-10-24 9:50 ` Clément Chigot
1 sibling, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-19 9:37 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 14/9/23 19:44, Richard Henderson wrote:
> Require i/o as the last insn of a TranslationBlock always,
> not only with icount. This is required for i/o that alters
> the address space, such as a pci config space write.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1866
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/translator.c | 20 +++++++-------------
> target/mips/tcg/translate.c | 1 -
> 2 files changed, 7 insertions(+), 14 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] accel/tcg: Always require can_do_io
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
2023-09-19 9:37 ` Philippe Mathieu-Daudé
@ 2023-10-24 9:50 ` Clément Chigot
2023-10-26 0:44 ` Richard Henderson
1 sibling, 1 reply; 19+ messages in thread
From: Clément Chigot @ 2023-10-24 9:50 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, philmd
Hi Richard,
This commit has broken some of our internal bareboard testing on
Risc-V 64. At some point in our programs, there is an AMOSWAP (=
atomic swap) instruction on I/O. But since this commit, can_do_io is
set to false triggering an infinite loop.
IIUC the doc (cf [1]), atomic operations on I/O are allowed.
I think there is a CF_LAST_IO flag missing somewhere to allow it, but
I'm not sure where this should be. Do you have any ideas ?
Sadly I cannot provide a reproducer that easily, mainly because our
microchip has a few patches not yet merged making our binaries not
running on the upstream master.
But here is a bit of the in_asm backtrace:
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000eb4: 1141 addi sp,sp,-16
| 0x80000eb6: 0c0027b7 lui a5,49154
| 0x80000eba: e406 sd ra,8(sp)
| 0x80000ebc: 00010597 auipc a1,16
# 0x80010ebc
| 0x80000ec0: 47458593 addi a1,a1,1140
| 0x80000ec4: f3ffe637 lui a2,-49154
| 0x80000ec8: 01878693 addi a3,a5,24
| 0x80000ecc: 00f58733 add a4,a1,a5
| 0x80000ed0: 9732 add a4,a4,a2
| 0x80000ed2: 4318 lw a4,0(a4)
| 0x80000ed4: 2701 sext.w a4,a4
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
| 0x80000eda: 0791 addi a5,a5,4
| 0x80000edc: fed798e3 bne a5,a3,-16
# 0x80000ecc
|
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
|
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
| * Freeze *
a5 (= 0xc002000) above is targeting an address inside sifive_plic HW.
Note IIUC the doc (cf [1]), atomic operations on I/O are allowed.
[1] https://github.com/riscv/riscv-isa-manual/blob/main/src/a-st-ext.adoc#atomic-memory-operations
Thanks in advance.
On Thu, Sep 14, 2023 at 7:45 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Require i/o as the last insn of a TranslationBlock always,
> not only with icount. This is required for i/o that alters
> the address space, such as a pci config space write.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1866
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/translator.c | 20 +++++++-------------
> target/mips/tcg/translate.c | 1 -
> 2 files changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index dd507cd471..358214d526 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -28,12 +28,6 @@ static void set_can_do_io(DisasContextBase *db, bool val)
>
> bool translator_io_start(DisasContextBase *db)
> {
> - uint32_t cflags = tb_cflags(db->tb);
> -
> - if (!(cflags & CF_USE_ICOUNT)) {
> - return false;
> - }
> -
> set_can_do_io(db, true);
>
> /*
> @@ -86,15 +80,15 @@ static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
> tcg_gen_st16_i32(count, cpu_env,
> offsetof(ArchCPU, neg.icount_decr.u16.low) -
> offsetof(ArchCPU, env));
> - /*
> - * cpu->can_do_io is set automatically here at the beginning of
> - * each translation block. The cost is minimal and only paid for
> - * -icount, plus it would be very easy to forget doing it in the
> - * translator.
> - */
> - set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
> }
>
> + /*
> + * cpu->can_do_io is set automatically here at the beginning of
> + * each translation block. The cost is minimal, plus it would be
> + * very easy to forget doing it in the translator.
> + */
> + set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
> +
> return icount_start_insn;
> }
>
> diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
> index 9bb40f1849..593fc80458 100644
> --- a/target/mips/tcg/translate.c
> +++ b/target/mips/tcg/translate.c
> @@ -11212,7 +11212,6 @@ static void gen_branch(DisasContext *ctx, int insn_bytes)
> /* Branches completion */
> clear_branch_hflags(ctx);
> ctx->base.is_jmp = DISAS_NORETURN;
> - /* FIXME: Need to clear can_do_io. */
> switch (proc_hflags & MIPS_HFLAG_BMASK_BASE) {
> case MIPS_HFLAG_FBNSLOT:
> gen_goto_tb(ctx, 0, ctx->base.pc_next + insn_bytes);
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] accel/tcg: Always require can_do_io
2023-10-24 9:50 ` Clément Chigot
@ 2023-10-26 0:44 ` Richard Henderson
2023-10-26 8:10 ` Clément Chigot
0 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-10-26 0:44 UTC (permalink / raw)
To: Clément Chigot; +Cc: qemu-devel, philmd
On 10/24/23 02:50, Clément Chigot wrote:
> Hi Richard,
>
> This commit has broken some of our internal bareboard testing on
> Risc-V 64. At some point in our programs, there is an AMOSWAP (=
> atomic swap) instruction on I/O. But since this commit, can_do_io is
> set to false triggering an infinite loop.
> IIUC the doc (cf [1]), atomic operations on I/O are allowed.
>
> I think there is a CF_LAST_IO flag missing somewhere to allow it, but
> I'm not sure where this should be. Do you have any ideas ?
>
> Sadly I cannot provide a reproducer that easily, mainly because our
> microchip has a few patches not yet merged making our binaries not
> running on the upstream master.
> But here is a bit of the in_asm backtrace:
>
> | IN: system__bb__riscv_plic__initialize
> | Priv: 3; Virt: 0
> | 0x80000eb4: 1141 addi sp,sp,-16
> | 0x80000eb6: 0c0027b7 lui a5,49154
> | 0x80000eba: e406 sd ra,8(sp)
> | 0x80000ebc: 00010597 auipc a1,16
> # 0x80010ebc
> | 0x80000ec0: 47458593 addi a1,a1,1140
> | 0x80000ec4: f3ffe637 lui a2,-49154
> | 0x80000ec8: 01878693 addi a3,a5,24
> | 0x80000ecc: 00f58733 add a4,a1,a5
> | 0x80000ed0: 9732 add a4,a4,a2
> | 0x80000ed2: 4318 lw a4,0(a4)
> | 0x80000ed4: 2701 sext.w a4,a4
> | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> | 0x80000eda: 0791 addi a5,a5,4
> | 0x80000edc: fed798e3 bne a5,a3,-16
> # 0x80000ecc
> |
> | ----------------
> | IN: system__bb__riscv_plic__initialize
> | Priv: 3; Virt: 0
> | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> |
> | ----------------
> | IN: system__bb__riscv_plic__initialize
> | Priv: 3; Virt: 0
> | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> | * Freeze *
I would expect two translations:
(1) with the original TB, aborts execution on !can_do_io.
(2) with the second TB, we get further into the actual execution and abort execution on
TLB_MMIO.
(3) with the third TB, we clear CF_PARALLEL and execute under cpu_exec_step_atomic.
Both 2 and 3 should have had CF_LAST_IO set.
You can verify this with '-d exec' output.
As a trivial example from qemu-system-alpha bios startup:
> Trace 0: 0x7f2584008380 [00000000/fffffc0000003ee4/01000000/ff000000] uart_init_line
> cpu_io_recompile: rewound execution of TB to fffffc0000003ee4
> ----------------
> IN: uart_init_line
> 0xfffffc0000003f20: stb t8,0(t6)
>
> Trace 0: 0x7f2584008a00 [00000000/fffffc0000003f20/01000000/ff018001] uart_init_line
Note that the final "/" field is cflags. The first "Trace" corresponds to (1), where the
store is in the middle of the TB. You can see the io_recompile abort, then the second
"Trace" contains {CF_COUNT=1, CF_LAST_IO, CF_MEMI_ONLY}.
In the short term, try adding CF_LAST_IO to cflags in cpu_exec_step_atomic.
I think probably the logic of CF_LAST_IO should always be applied now, since can_do_io is
always live, and thus the flag itself should go away.
r~
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] accel/tcg: Always require can_do_io
2023-10-26 0:44 ` Richard Henderson
@ 2023-10-26 8:10 ` Clément Chigot
0 siblings, 0 replies; 19+ messages in thread
From: Clément Chigot @ 2023-10-26 8:10 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, philmd
On Thu, Oct 26, 2023 at 2:44 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 10/24/23 02:50, Clément Chigot wrote:
> > Hi Richard,
> >
> > This commit has broken some of our internal bareboard testing on
> > Risc-V 64. At some point in our programs, there is an AMOSWAP (=
> > atomic swap) instruction on I/O. But since this commit, can_do_io is
> > set to false triggering an infinite loop.
> > IIUC the doc (cf [1]), atomic operations on I/O are allowed.
> >
> > I think there is a CF_LAST_IO flag missing somewhere to allow it, but
> > I'm not sure where this should be. Do you have any ideas ?
> >
> > Sadly I cannot provide a reproducer that easily, mainly because our
> > microchip has a few patches not yet merged making our binaries not
> > running on the upstream master.
> > But here is a bit of the in_asm backtrace:
> >
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000eb4: 1141 addi sp,sp,-16
> > | 0x80000eb6: 0c0027b7 lui a5,49154
> > | 0x80000eba: e406 sd ra,8(sp)
> > | 0x80000ebc: 00010597 auipc a1,16
> > # 0x80010ebc
> > | 0x80000ec0: 47458593 addi a1,a1,1140
> > | 0x80000ec4: f3ffe637 lui a2,-49154
> > | 0x80000ec8: 01878693 addi a3,a5,24
> > | 0x80000ecc: 00f58733 add a4,a1,a5
> > | 0x80000ed0: 9732 add a4,a4,a2
> > | 0x80000ed2: 4318 lw a4,0(a4)
> > | 0x80000ed4: 2701 sext.w a4,a4
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > | 0x80000eda: 0791 addi a5,a5,4
> > | 0x80000edc: fed798e3 bne a5,a3,-16
> > # 0x80000ecc
> > |
> > | ----------------
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > |
> > | ----------------
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > | * Freeze *
>
> I would expect two translations:
>
> (1) with the original TB, aborts execution on !can_do_io.
> (2) with the second TB, we get further into the actual execution and abort execution on
> TLB_MMIO.
> (3) with the third TB, we clear CF_PARALLEL and execute under cpu_exec_step_atomic.
>
> Both 2 and 3 should have had CF_LAST_IO set.
> You can verify this with '-d exec' output.
>
> As a trivial example from qemu-system-alpha bios startup:
>
> > Trace 0: 0x7f2584008380 [00000000/fffffc0000003ee4/01000000/ff000000] uart_init_line
> > cpu_io_recompile: rewound execution of TB to fffffc0000003ee4
> > ----------------
> > IN: uart_init_line
> > 0xfffffc0000003f20: stb t8,0(t6)
> >
> > Trace 0: 0x7f2584008a00 [00000000/fffffc0000003f20/01000000/ff018001] uart_init_line
>
> Note that the final "/" field is cflags. The first "Trace" corresponds to (1), where the
> store is in the middle of the TB. You can see the io_recompile abort, then the second
> "Trace" contains {CF_COUNT=1, CF_LAST_IO, CF_MEMI_ONLY}.
With the exec, it's indeed a bit clearer.
I do have a new translation made by cpu_io_recompile which sets
CF_LAST_IO (2). But afterward, it somehow loops back to the previous
translation which doesn't have CF_LAST_IO which calls cpu_io_recompile
again, etc. This triggers an infinite loop between these two
translations
| ----------------
| IN: system__bb__riscv_plic__initialize
| 0x80000eb4: 1141 addi sp,sp,-16
| ...
| 0x80000ed4: 2701 sext.w a4,a4
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
| 0x80000eda: 0791 addi a5,a5,4
| 0x80000edc: fed798e3 bne a5,a3,-16
# 0x80000ecc
|
| Linking TBs 0x7f0d3199db40 index 0 -> 0x7f0d3199dd00
| Trace 1: 0x7f0d3199dd00
[00000000/0000000080000eb4/0b02401b/ff280000]
system__bb__riscv_plic__initialize
(1) First exec of amoswap without CF_LAST_IO.
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
|
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
(2) Second exec with CF_LAST_IO.
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
|
| Trace 1: 0x7f0d3199e140
[00000000/0000000080000ed6/0b02401b/ff298001]
system__bb__riscv_plic__initialize
Loop back (1) and start the infinite loop between (1) and (2).
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
| Trace 1: 0x7f0d3199e140
[00000000/0000000080000ed6/0b02401b/ff298001]
system__bb__riscv_plic__initialize
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
I'll continue to investigate. Even if the workaround you provided
works, there is something weird happening here.
> In the short term, try adding CF_LAST_IO to cflags in cpu_exec_step_atomic.
Thanks for that workaround.
Thanks,
Clément
> I think probably the logic of CF_LAST_IO should always be applied now, since can_do_io is
> always live, and thus the flag itself should go away.
>
>
> r~
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2023-10-26 8:11 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
2023-09-14 17:44 ` [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused Richard Henderson
2023-09-15 9:23 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop Richard Henderson
2023-09-15 9:26 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB Richard Henderson
2023-09-15 9:41 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
2023-09-15 9:41 ` [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB Philippe Mathieu-Daudé
2023-09-15 9:44 ` Philippe Mathieu-Daudé
2023-09-15 9:42 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB Richard Henderson
2023-09-15 9:47 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ Richard Henderson
2023-09-19 9:26 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
2023-09-19 9:37 ` Philippe Mathieu-Daudé
2023-10-24 9:50 ` Clément Chigot
2023-10-26 0:44 ` Richard Henderson
2023-10-26 8:10 ` Clément Chigot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).