* [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-15 9:23 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop Richard Henderson
` (4 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
With CF_NOIRQ and without !CF_USE_ICOUNT, the load isn't used.
Avoid emitting it.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 1a6a5448c8..a3983019a5 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -49,12 +49,15 @@ bool translator_io_start(DisasContextBase *db)
static TCGOp *gen_tb_start(uint32_t cflags)
{
- TCGv_i32 count = tcg_temp_new_i32();
+ TCGv_i32 count = NULL;
TCGOp *icount_start_insn = NULL;
- tcg_gen_ld_i32(count, cpu_env,
- offsetof(ArchCPU, neg.icount_decr.u32) -
- offsetof(ArchCPU, env));
+ if ((cflags & CF_USE_ICOUNT) || !(cflags & CF_NOIRQ)) {
+ count = tcg_temp_new_i32();
+ tcg_gen_ld_i32(count, cpu_env,
+ offsetof(ArchCPU, neg.icount_decr.u32) -
+ offsetof(ArchCPU, env));
+ }
if (cflags & CF_USE_ICOUNT) {
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
2023-09-14 17:44 ` [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-15 9:26 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB Richard Henderson
` (3 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
The condition checked is loop invariant; check it only once.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index a3983019a5..b6ab9f3d33 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -158,7 +158,13 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
- plugin_enabled = plugin_gen_tb_start(cpu, db, cflags & CF_MEMI_ONLY);
+ if (cflags & CF_MEMI_ONLY) {
+ /* We should only see CF_MEMI_ONLY for io_recompile. */
+ assert(cflags & CF_LAST_IO);
+ plugin_enabled = plugin_gen_tb_start(cpu, db, true);
+ } else {
+ plugin_enabled = plugin_gen_tb_start(cpu, db, false);
+ }
while (true) {
*max_insns = ++db->num_insns;
@@ -176,12 +182,8 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
gen_io_start();
- ops->translate_insn(db, cpu);
- } else {
- /* we should only see CF_MEMI_ONLY for io_recompile */
- tcg_debug_assert(!(cflags & CF_MEMI_ONLY));
- ops->translate_insn(db, cpu);
}
+ ops->translate_insn(db, cpu);
/*
* We can't instrument after instructions that change control
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
2023-09-14 17:44 ` [PATCH 1/6] accel/tcg: Avoid load of icount_decr if unused Richard Henderson
2023-09-14 17:44 ` [PATCH 2/6] accel/tcg: Hoist CF_MEMI_ONLY check outside translation loop Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-15 9:41 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB Richard Henderson
` (2 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
Simplify translator_io_start by recording the current
known value of can_do_io within DisasContextBase.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/translator.h | 2 ++
accel/tcg/translator.c | 31 ++++++++++++++-----------------
2 files changed, 16 insertions(+), 17 deletions(-)
diff --git a/include/exec/translator.h b/include/exec/translator.h
index 4e17c4f401..9d9e980819 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -72,6 +72,7 @@ typedef enum DisasJumpType {
* @num_insns: Number of translated instructions (including current).
* @max_insns: Maximum number of instructions to be translated in this TB.
* @singlestep_enabled: "Hardware" single stepping enabled.
+ * @saved_can_do_io: Known value of cpu->neg.can_do_io, or -1 for unknown.
*
* Architecture-agnostic disassembly context.
*/
@@ -83,6 +84,7 @@ typedef struct DisasContextBase {
int num_insns;
int max_insns;
bool singlestep_enabled;
+ int8_t saved_can_do_io;
void *host_addr[2];
} DisasContextBase;
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index b6ab9f3d33..850d82e26f 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -16,11 +16,14 @@
#include "tcg/tcg-op-common.h"
#include "internal.h"
-static void gen_io_start(void)
+static void set_can_do_io(DisasContextBase *db, bool val)
{
- tcg_gen_st_i32(tcg_constant_i32(1), cpu_env,
- offsetof(ArchCPU, parent_obj.can_do_io) -
- offsetof(ArchCPU, env));
+ if (db->saved_can_do_io != val) {
+ db->saved_can_do_io = val;
+ tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
+ offsetof(ArchCPU, parent_obj.can_do_io) -
+ offsetof(ArchCPU, env));
+ }
}
bool translator_io_start(DisasContextBase *db)
@@ -30,12 +33,8 @@ bool translator_io_start(DisasContextBase *db)
if (!(cflags & CF_USE_ICOUNT)) {
return false;
}
- if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
- /* Already started in translator_loop. */
- return true;
- }
- gen_io_start();
+ set_can_do_io(db, true);
/*
* Ensure that this instruction will be the last in the TB.
@@ -47,7 +46,7 @@ bool translator_io_start(DisasContextBase *db)
return true;
}
-static TCGOp *gen_tb_start(uint32_t cflags)
+static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
{
TCGv_i32 count = NULL;
TCGOp *icount_start_insn = NULL;
@@ -91,12 +90,9 @@ static TCGOp *gen_tb_start(uint32_t cflags)
* cpu->can_do_io is cleared automatically here at the beginning of
* each translation block. The cost is minimal and only paid for
* -icount, plus it would be very easy to forget doing it in the
- * translator. Doing it here means we don't need a gen_io_end() to
- * go with gen_io_start().
+ * translator.
*/
- tcg_gen_st_i32(tcg_constant_i32(0), cpu_env,
- offsetof(ArchCPU, parent_obj.can_do_io) -
- offsetof(ArchCPU, env));
+ set_can_do_io(db, false);
}
return icount_start_insn;
@@ -147,6 +143,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
db->num_insns = 0;
db->max_insns = *max_insns;
db->singlestep_enabled = cflags & CF_SINGLE_STEP;
+ db->saved_can_do_io = -1;
db->host_addr[0] = host_pc;
db->host_addr[1] = NULL;
@@ -154,7 +151,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
/* Start translating. */
- icount_start_insn = gen_tb_start(cflags);
+ icount_start_insn = gen_tb_start(db, cflags);
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
@@ -181,7 +178,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
the next instruction. */
if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
- gen_io_start();
+ set_can_do_io(db, true);
}
ops->translate_insn(db, cpu);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io()
2023-09-14 17:44 ` [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB Richard Henderson
@ 2023-09-15 9:41 ` Philippe Mathieu-Daudé
2023-09-15 9:41 ` [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB Philippe Mathieu-Daudé
2023-09-15 9:42 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
0 siblings, 2 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:41 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson, Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230914174436.1597356-4-richard.henderson@linaro.org>
[PMD: Split patch in 2, extracting set_can_do_io() first]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
accel/tcg/translator.c | 19 ++++++++-----------
1 file changed, 8 insertions(+), 11 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index b6ab9f3d33..cebf0a84cc 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -16,9 +16,9 @@
#include "tcg/tcg-op-common.h"
#include "internal.h"
-static void gen_io_start(void)
+static void set_can_do_io(DisasContextBase *db, bool val)
{
- tcg_gen_st_i32(tcg_constant_i32(1), cpu_env,
+ tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
offsetof(ArchCPU, parent_obj.can_do_io) -
offsetof(ArchCPU, env));
}
@@ -35,7 +35,7 @@ bool translator_io_start(DisasContextBase *db)
return true;
}
- gen_io_start();
+ set_can_do_io(db, true);
/*
* Ensure that this instruction will be the last in the TB.
@@ -47,7 +47,7 @@ bool translator_io_start(DisasContextBase *db)
return true;
}
-static TCGOp *gen_tb_start(uint32_t cflags)
+static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
{
TCGv_i32 count = NULL;
TCGOp *icount_start_insn = NULL;
@@ -91,12 +91,9 @@ static TCGOp *gen_tb_start(uint32_t cflags)
* cpu->can_do_io is cleared automatically here at the beginning of
* each translation block. The cost is minimal and only paid for
* -icount, plus it would be very easy to forget doing it in the
- * translator. Doing it here means we don't need a gen_io_end() to
- * go with gen_io_start().
+ * translator.
*/
- tcg_gen_st_i32(tcg_constant_i32(0), cpu_env,
- offsetof(ArchCPU, parent_obj.can_do_io) -
- offsetof(ArchCPU, env));
+ set_can_do_io(db, false);
}
return icount_start_insn;
@@ -154,7 +151,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
/* Start translating. */
- icount_start_insn = gen_tb_start(cflags);
+ icount_start_insn = gen_tb_start(db, cflags);
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
@@ -181,7 +178,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
the next instruction. */
if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
- gen_io_start();
+ set_can_do_io(db, true);
}
ops->translate_insn(db, cpu);
--
2.41.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB
2023-09-15 9:41 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
@ 2023-09-15 9:41 ` Philippe Mathieu-Daudé
2023-09-15 9:44 ` Philippe Mathieu-Daudé
2023-09-15 9:42 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
1 sibling, 1 reply; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:41 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson, Philippe Mathieu-Daudé
From: Richard Henderson <richard.henderson@linaro.org>
Simplify translator_io_start by recording the current
known value of can_do_io within DisasContextBase.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230914174436.1597356-4-richard.henderson@linaro.org>
[PMD: Split patch in 2, extracting set_can_do_io() first]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
include/exec/translator.h | 2 ++
accel/tcg/translator.c | 14 +++++++-------
2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/include/exec/translator.h b/include/exec/translator.h
index 4e17c4f401..9d9e980819 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -72,6 +72,7 @@ typedef enum DisasJumpType {
* @num_insns: Number of translated instructions (including current).
* @max_insns: Maximum number of instructions to be translated in this TB.
* @singlestep_enabled: "Hardware" single stepping enabled.
+ * @saved_can_do_io: Known value of cpu->neg.can_do_io, or -1 for unknown.
*
* Architecture-agnostic disassembly context.
*/
@@ -83,6 +84,7 @@ typedef struct DisasContextBase {
int num_insns;
int max_insns;
bool singlestep_enabled;
+ int8_t saved_can_do_io;
void *host_addr[2];
} DisasContextBase;
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index cebf0a84cc..850d82e26f 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -18,9 +18,12 @@
static void set_can_do_io(DisasContextBase *db, bool val)
{
- tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
- offsetof(ArchCPU, parent_obj.can_do_io) -
- offsetof(ArchCPU, env));
+ if (db->saved_can_do_io != val) {
+ db->saved_can_do_io = val;
+ tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
+ offsetof(ArchCPU, parent_obj.can_do_io) -
+ offsetof(ArchCPU, env));
+ }
}
bool translator_io_start(DisasContextBase *db)
@@ -30,10 +33,6 @@ bool translator_io_start(DisasContextBase *db)
if (!(cflags & CF_USE_ICOUNT)) {
return false;
}
- if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
- /* Already started in translator_loop. */
- return true;
- }
set_can_do_io(db, true);
@@ -144,6 +143,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
db->num_insns = 0;
db->max_insns = *max_insns;
db->singlestep_enabled = cflags & CF_SINGLE_STEP;
+ db->saved_can_do_io = -1;
db->host_addr[0] = host_pc;
db->host_addr[1] = NULL;
--
2.41.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB
2023-09-15 9:41 ` [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB Philippe Mathieu-Daudé
@ 2023-09-15 9:44 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:44 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson
On 15/9/23 11:41, Philippe Mathieu-Daudé wrote:
> From: Richard Henderson <richard.henderson@linaro.org>
>
> Simplify translator_io_start by recording the current
> known value of can_do_io within DisasContextBase.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> Message-ID: <20230914174436.1597356-4-richard.henderson@linaro.org>
> [PMD: Split patch in 2, extracting set_can_do_io() first]
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
> include/exec/translator.h | 2 ++
> accel/tcg/translator.c | 14 +++++++-------
> 2 files changed, 9 insertions(+), 7 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io()
2023-09-15 9:41 ` [PATCH 3/6: 1/2] accel/tcg: Refactor gen_io_start() as set_can_do_io() Philippe Mathieu-Daudé
2023-09-15 9:41 ` [PATCH 3/6: 2/2] accel/tcg: Track current value of can_do_io in the TB Philippe Mathieu-Daudé
@ 2023-09-15 9:42 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-15 9:42 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, Richard Henderson
On 15/9/23 11:41, Philippe Mathieu-Daudé wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> Message-ID: <20230914174436.1597356-4-richard.henderson@linaro.org>
> [PMD: Split patch in 2, extracting set_can_do_io() first]
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
> accel/tcg/translator.c | 19 ++++++++-----------
> 1 file changed, 8 insertions(+), 11 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
` (2 preceding siblings ...)
2023-09-14 17:44 ` [PATCH 3/6] accel/tcg: Track current value of can_do_io in the TB Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-15 9:47 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ Richard Henderson
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
Initialize can_do_io to true if this the TB has CF_LAST_IO
and will consist of a single instruction. This avoids a
set to 0 followed immediately by a set to 1.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 850d82e26f..dd507cd471 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -87,12 +87,12 @@ static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
offsetof(ArchCPU, neg.icount_decr.u16.low) -
offsetof(ArchCPU, env));
/*
- * cpu->can_do_io is cleared automatically here at the beginning of
+ * cpu->can_do_io is set automatically here at the beginning of
* each translation block. The cost is minimal and only paid for
* -icount, plus it would be very easy to forget doing it in the
* translator.
*/
- set_can_do_io(db, false);
+ set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
}
return icount_start_insn;
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
` (3 preceding siblings ...)
2023-09-14 17:44 ` [PATCH 4/6] accel/tcg: Improve setting of can_do_io at start of TB Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-19 9:26 ` Philippe Mathieu-Daudé
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
5 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
Without this we can get see loops through cpu_io_recompile,
in which the cpu makes no progress.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cpu-exec.c | 2 +-
accel/tcg/tb-maint.c | 6 ++++--
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index e2c494e75e..c724e8b6f1 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -720,7 +720,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
&& cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra == 0) {
/* Execute just one insn to trigger exception pending in the log */
cpu->cflags_next_tb = (curr_cflags(cpu) & ~CF_USE_ICOUNT)
- | CF_NOIRQ | 1;
+ | CF_LAST_IO | CF_NOIRQ | 1;
}
#endif
return false;
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 32ae8af61c..0c3e227409 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -1083,7 +1083,8 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
if (current_tb_modified) {
/* Force execution of one insn next time. */
CPUState *cpu = current_cpu;
- cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(current_cpu);
+ cpu->cflags_next_tb =
+ 1 | CF_LAST_IO | CF_NOIRQ | curr_cflags(current_cpu);
return true;
}
return false;
@@ -1153,7 +1154,8 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
if (current_tb_modified) {
page_collection_unlock(pages);
/* Force execution of one insn next time. */
- current_cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(current_cpu);
+ current_cpu->cflags_next_tb =
+ 1 | CF_LAST_IO | CF_NOIRQ | curr_cflags(current_cpu);
mmap_unlock();
cpu_loop_exit_noexc(current_cpu);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 6/6] accel/tcg: Always require can_do_io
2023-09-14 17:44 [PATCH 0/6] accel/tcg: Always require can_do_io (#1866) Richard Henderson
` (4 preceding siblings ...)
2023-09-14 17:44 ` [PATCH 5/6] accel/tcg: Always set CF_LAST_IO with CF_NOIRQ Richard Henderson
@ 2023-09-14 17:44 ` Richard Henderson
2023-09-19 9:37 ` Philippe Mathieu-Daudé
2023-10-24 9:50 ` Clément Chigot
5 siblings, 2 replies; 19+ messages in thread
From: Richard Henderson @ 2023-09-14 17:44 UTC (permalink / raw)
To: qemu-devel; +Cc: philmd
Require i/o as the last insn of a TranslationBlock always,
not only with icount. This is required for i/o that alters
the address space, such as a pci config space write.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1866
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 20 +++++++-------------
target/mips/tcg/translate.c | 1 -
2 files changed, 7 insertions(+), 14 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index dd507cd471..358214d526 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -28,12 +28,6 @@ static void set_can_do_io(DisasContextBase *db, bool val)
bool translator_io_start(DisasContextBase *db)
{
- uint32_t cflags = tb_cflags(db->tb);
-
- if (!(cflags & CF_USE_ICOUNT)) {
- return false;
- }
-
set_can_do_io(db, true);
/*
@@ -86,15 +80,15 @@ static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
tcg_gen_st16_i32(count, cpu_env,
offsetof(ArchCPU, neg.icount_decr.u16.low) -
offsetof(ArchCPU, env));
- /*
- * cpu->can_do_io is set automatically here at the beginning of
- * each translation block. The cost is minimal and only paid for
- * -icount, plus it would be very easy to forget doing it in the
- * translator.
- */
- set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
}
+ /*
+ * cpu->can_do_io is set automatically here at the beginning of
+ * each translation block. The cost is minimal, plus it would be
+ * very easy to forget doing it in the translator.
+ */
+ set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
+
return icount_start_insn;
}
diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index 9bb40f1849..593fc80458 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -11212,7 +11212,6 @@ static void gen_branch(DisasContext *ctx, int insn_bytes)
/* Branches completion */
clear_branch_hflags(ctx);
ctx->base.is_jmp = DISAS_NORETURN;
- /* FIXME: Need to clear can_do_io. */
switch (proc_hflags & MIPS_HFLAG_BMASK_BASE) {
case MIPS_HFLAG_FBNSLOT:
gen_goto_tb(ctx, 0, ctx->base.pc_next + insn_bytes);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] accel/tcg: Always require can_do_io
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
@ 2023-09-19 9:37 ` Philippe Mathieu-Daudé
2023-10-24 9:50 ` Clément Chigot
1 sibling, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-19 9:37 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 14/9/23 19:44, Richard Henderson wrote:
> Require i/o as the last insn of a TranslationBlock always,
> not only with icount. This is required for i/o that alters
> the address space, such as a pci config space write.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1866
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/translator.c | 20 +++++++-------------
> target/mips/tcg/translate.c | 1 -
> 2 files changed, 7 insertions(+), 14 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] accel/tcg: Always require can_do_io
2023-09-14 17:44 ` [PATCH 6/6] accel/tcg: Always require can_do_io Richard Henderson
2023-09-19 9:37 ` Philippe Mathieu-Daudé
@ 2023-10-24 9:50 ` Clément Chigot
2023-10-26 0:44 ` Richard Henderson
1 sibling, 1 reply; 19+ messages in thread
From: Clément Chigot @ 2023-10-24 9:50 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, philmd
Hi Richard,
This commit has broken some of our internal bareboard testing on
Risc-V 64. At some point in our programs, there is an AMOSWAP (=
atomic swap) instruction on I/O. But since this commit, can_do_io is
set to false triggering an infinite loop.
IIUC the doc (cf [1]), atomic operations on I/O are allowed.
I think there is a CF_LAST_IO flag missing somewhere to allow it, but
I'm not sure where this should be. Do you have any ideas ?
Sadly I cannot provide a reproducer that easily, mainly because our
microchip has a few patches not yet merged making our binaries not
running on the upstream master.
But here is a bit of the in_asm backtrace:
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000eb4: 1141 addi sp,sp,-16
| 0x80000eb6: 0c0027b7 lui a5,49154
| 0x80000eba: e406 sd ra,8(sp)
| 0x80000ebc: 00010597 auipc a1,16
# 0x80010ebc
| 0x80000ec0: 47458593 addi a1,a1,1140
| 0x80000ec4: f3ffe637 lui a2,-49154
| 0x80000ec8: 01878693 addi a3,a5,24
| 0x80000ecc: 00f58733 add a4,a1,a5
| 0x80000ed0: 9732 add a4,a4,a2
| 0x80000ed2: 4318 lw a4,0(a4)
| 0x80000ed4: 2701 sext.w a4,a4
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
| 0x80000eda: 0791 addi a5,a5,4
| 0x80000edc: fed798e3 bne a5,a3,-16
# 0x80000ecc
|
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
|
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
| * Freeze *
a5 (= 0xc002000) above is targeting an address inside sifive_plic HW.
Note IIUC the doc (cf [1]), atomic operations on I/O are allowed.
[1] https://github.com/riscv/riscv-isa-manual/blob/main/src/a-st-ext.adoc#atomic-memory-operations
Thanks in advance.
On Thu, Sep 14, 2023 at 7:45 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Require i/o as the last insn of a TranslationBlock always,
> not only with icount. This is required for i/o that alters
> the address space, such as a pci config space write.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1866
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/translator.c | 20 +++++++-------------
> target/mips/tcg/translate.c | 1 -
> 2 files changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index dd507cd471..358214d526 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -28,12 +28,6 @@ static void set_can_do_io(DisasContextBase *db, bool val)
>
> bool translator_io_start(DisasContextBase *db)
> {
> - uint32_t cflags = tb_cflags(db->tb);
> -
> - if (!(cflags & CF_USE_ICOUNT)) {
> - return false;
> - }
> -
> set_can_do_io(db, true);
>
> /*
> @@ -86,15 +80,15 @@ static TCGOp *gen_tb_start(DisasContextBase *db, uint32_t cflags)
> tcg_gen_st16_i32(count, cpu_env,
> offsetof(ArchCPU, neg.icount_decr.u16.low) -
> offsetof(ArchCPU, env));
> - /*
> - * cpu->can_do_io is set automatically here at the beginning of
> - * each translation block. The cost is minimal and only paid for
> - * -icount, plus it would be very easy to forget doing it in the
> - * translator.
> - */
> - set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
> }
>
> + /*
> + * cpu->can_do_io is set automatically here at the beginning of
> + * each translation block. The cost is minimal, plus it would be
> + * very easy to forget doing it in the translator.
> + */
> + set_can_do_io(db, db->max_insns == 1 && (cflags & CF_LAST_IO));
> +
> return icount_start_insn;
> }
>
> diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
> index 9bb40f1849..593fc80458 100644
> --- a/target/mips/tcg/translate.c
> +++ b/target/mips/tcg/translate.c
> @@ -11212,7 +11212,6 @@ static void gen_branch(DisasContext *ctx, int insn_bytes)
> /* Branches completion */
> clear_branch_hflags(ctx);
> ctx->base.is_jmp = DISAS_NORETURN;
> - /* FIXME: Need to clear can_do_io. */
> switch (proc_hflags & MIPS_HFLAG_BMASK_BASE) {
> case MIPS_HFLAG_FBNSLOT:
> gen_goto_tb(ctx, 0, ctx->base.pc_next + insn_bytes);
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] accel/tcg: Always require can_do_io
2023-10-24 9:50 ` Clément Chigot
@ 2023-10-26 0:44 ` Richard Henderson
2023-10-26 8:10 ` Clément Chigot
0 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2023-10-26 0:44 UTC (permalink / raw)
To: Clément Chigot; +Cc: qemu-devel, philmd
On 10/24/23 02:50, Clément Chigot wrote:
> Hi Richard,
>
> This commit has broken some of our internal bareboard testing on
> Risc-V 64. At some point in our programs, there is an AMOSWAP (=
> atomic swap) instruction on I/O. But since this commit, can_do_io is
> set to false triggering an infinite loop.
> IIUC the doc (cf [1]), atomic operations on I/O are allowed.
>
> I think there is a CF_LAST_IO flag missing somewhere to allow it, but
> I'm not sure where this should be. Do you have any ideas ?
>
> Sadly I cannot provide a reproducer that easily, mainly because our
> microchip has a few patches not yet merged making our binaries not
> running on the upstream master.
> But here is a bit of the in_asm backtrace:
>
> | IN: system__bb__riscv_plic__initialize
> | Priv: 3; Virt: 0
> | 0x80000eb4: 1141 addi sp,sp,-16
> | 0x80000eb6: 0c0027b7 lui a5,49154
> | 0x80000eba: e406 sd ra,8(sp)
> | 0x80000ebc: 00010597 auipc a1,16
> # 0x80010ebc
> | 0x80000ec0: 47458593 addi a1,a1,1140
> | 0x80000ec4: f3ffe637 lui a2,-49154
> | 0x80000ec8: 01878693 addi a3,a5,24
> | 0x80000ecc: 00f58733 add a4,a1,a5
> | 0x80000ed0: 9732 add a4,a4,a2
> | 0x80000ed2: 4318 lw a4,0(a4)
> | 0x80000ed4: 2701 sext.w a4,a4
> | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> | 0x80000eda: 0791 addi a5,a5,4
> | 0x80000edc: fed798e3 bne a5,a3,-16
> # 0x80000ecc
> |
> | ----------------
> | IN: system__bb__riscv_plic__initialize
> | Priv: 3; Virt: 0
> | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> |
> | ----------------
> | IN: system__bb__riscv_plic__initialize
> | Priv: 3; Virt: 0
> | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> | * Freeze *
I would expect two translations:
(1) with the original TB, aborts execution on !can_do_io.
(2) with the second TB, we get further into the actual execution and abort execution on
TLB_MMIO.
(3) with the third TB, we clear CF_PARALLEL and execute under cpu_exec_step_atomic.
Both 2 and 3 should have had CF_LAST_IO set.
You can verify this with '-d exec' output.
As a trivial example from qemu-system-alpha bios startup:
> Trace 0: 0x7f2584008380 [00000000/fffffc0000003ee4/01000000/ff000000] uart_init_line
> cpu_io_recompile: rewound execution of TB to fffffc0000003ee4
> ----------------
> IN: uart_init_line
> 0xfffffc0000003f20: stb t8,0(t6)
>
> Trace 0: 0x7f2584008a00 [00000000/fffffc0000003f20/01000000/ff018001] uart_init_line
Note that the final "/" field is cflags. The first "Trace" corresponds to (1), where the
store is in the middle of the TB. You can see the io_recompile abort, then the second
"Trace" contains {CF_COUNT=1, CF_LAST_IO, CF_MEMI_ONLY}.
In the short term, try adding CF_LAST_IO to cflags in cpu_exec_step_atomic.
I think probably the logic of CF_LAST_IO should always be applied now, since can_do_io is
always live, and thus the flag itself should go away.
r~
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 6/6] accel/tcg: Always require can_do_io
2023-10-26 0:44 ` Richard Henderson
@ 2023-10-26 8:10 ` Clément Chigot
0 siblings, 0 replies; 19+ messages in thread
From: Clément Chigot @ 2023-10-26 8:10 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, philmd
On Thu, Oct 26, 2023 at 2:44 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 10/24/23 02:50, Clément Chigot wrote:
> > Hi Richard,
> >
> > This commit has broken some of our internal bareboard testing on
> > Risc-V 64. At some point in our programs, there is an AMOSWAP (=
> > atomic swap) instruction on I/O. But since this commit, can_do_io is
> > set to false triggering an infinite loop.
> > IIUC the doc (cf [1]), atomic operations on I/O are allowed.
> >
> > I think there is a CF_LAST_IO flag missing somewhere to allow it, but
> > I'm not sure where this should be. Do you have any ideas ?
> >
> > Sadly I cannot provide a reproducer that easily, mainly because our
> > microchip has a few patches not yet merged making our binaries not
> > running on the upstream master.
> > But here is a bit of the in_asm backtrace:
> >
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000eb4: 1141 addi sp,sp,-16
> > | 0x80000eb6: 0c0027b7 lui a5,49154
> > | 0x80000eba: e406 sd ra,8(sp)
> > | 0x80000ebc: 00010597 auipc a1,16
> > # 0x80010ebc
> > | 0x80000ec0: 47458593 addi a1,a1,1140
> > | 0x80000ec4: f3ffe637 lui a2,-49154
> > | 0x80000ec8: 01878693 addi a3,a5,24
> > | 0x80000ecc: 00f58733 add a4,a1,a5
> > | 0x80000ed0: 9732 add a4,a4,a2
> > | 0x80000ed2: 4318 lw a4,0(a4)
> > | 0x80000ed4: 2701 sext.w a4,a4
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > | 0x80000eda: 0791 addi a5,a5,4
> > | 0x80000edc: fed798e3 bne a5,a3,-16
> > # 0x80000ecc
> > |
> > | ----------------
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > |
> > | ----------------
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > | * Freeze *
>
> I would expect two translations:
>
> (1) with the original TB, aborts execution on !can_do_io.
> (2) with the second TB, we get further into the actual execution and abort execution on
> TLB_MMIO.
> (3) with the third TB, we clear CF_PARALLEL and execute under cpu_exec_step_atomic.
>
> Both 2 and 3 should have had CF_LAST_IO set.
> You can verify this with '-d exec' output.
>
> As a trivial example from qemu-system-alpha bios startup:
>
> > Trace 0: 0x7f2584008380 [00000000/fffffc0000003ee4/01000000/ff000000] uart_init_line
> > cpu_io_recompile: rewound execution of TB to fffffc0000003ee4
> > ----------------
> > IN: uart_init_line
> > 0xfffffc0000003f20: stb t8,0(t6)
> >
> > Trace 0: 0x7f2584008a00 [00000000/fffffc0000003f20/01000000/ff018001] uart_init_line
>
> Note that the final "/" field is cflags. The first "Trace" corresponds to (1), where the
> store is in the middle of the TB. You can see the io_recompile abort, then the second
> "Trace" contains {CF_COUNT=1, CF_LAST_IO, CF_MEMI_ONLY}.
With the exec, it's indeed a bit clearer.
I do have a new translation made by cpu_io_recompile which sets
CF_LAST_IO (2). But afterward, it somehow loops back to the previous
translation which doesn't have CF_LAST_IO which calls cpu_io_recompile
again, etc. This triggers an infinite loop between these two
translations
| ----------------
| IN: system__bb__riscv_plic__initialize
| 0x80000eb4: 1141 addi sp,sp,-16
| ...
| 0x80000ed4: 2701 sext.w a4,a4
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
| 0x80000eda: 0791 addi a5,a5,4
| 0x80000edc: fed798e3 bne a5,a3,-16
# 0x80000ecc
|
| Linking TBs 0x7f0d3199db40 index 0 -> 0x7f0d3199dd00
| Trace 1: 0x7f0d3199dd00
[00000000/0000000080000eb4/0b02401b/ff280000]
system__bb__riscv_plic__initialize
(1) First exec of amoswap without CF_LAST_IO.
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
|
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
(2) Second exec with CF_LAST_IO.
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
|
| Trace 1: 0x7f0d3199e140
[00000000/0000000080000ed6/0b02401b/ff298001]
system__bb__riscv_plic__initialize
Loop back (1) and start the infinite loop between (1) and (2).
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
| Trace 1: 0x7f0d3199e140
[00000000/0000000080000ed6/0b02401b/ff298001]
system__bb__riscv_plic__initialize
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
I'll continue to investigate. Even if the workaround you provided
works, there is something weird happening here.
> In the short term, try adding CF_LAST_IO to cflags in cpu_exec_step_atomic.
Thanks for that workaround.
Thanks,
Clément
> I think probably the logic of CF_LAST_IO should always be applied now, since can_do_io is
> always live, and thus the flag itself should go away.
>
>
> r~
^ permalink raw reply [flat|nested] 19+ messages in thread