* [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements
From: Richard Henderson @ 2017-07-07 2:20 UTC
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
This fixes two problems with atomic operations on sh4,
including an attempt at supporting the user-space atomics
technique used by most sh-linux-user binaries.
Changes since v1:
* Rebase on Aurelien's recent sh4 patchset.
* Patches 3 and 5 split out of patch 6.
* Patch 4 fixes the sh4-softmmu problem that Aurelien reported.
* Handle more cases of atomic_fetch_op seen in debian images.
* More cleanups for register banking.
* Fix for 64-bit fp memory operations.
* Tidy illegal instruction checks.
* Implement fpchg (missing from sh4a)
* Implement fsrra (had a nop implementation)
* Use tcg_gen_lookup_and_goto_ptr for simple branches.
Tested with debian unstable bash and our sh-test-0.2.
I do *not* see the crashes that glaubitz reported.
If they're still there I'll need a more complete report.
Full tree is again at
git://github.com/rth7680/qemu.git tgt-sh4
r~
Richard Henderson (27):
target/sh4: Use cmpxchg for movco
target/sh4: Consolidate end-of-TB tests
target/sh4: Introduce TB_FLAG_ENVFLAGS_MASK
target/sh4: Keep env->flags clean
target/sh4: Adjust TB_FLAG_PENDING_MOVCA
target/sh4: Handle user-space atomics
target/sh4: Recognize common gUSA sequences
linux-user/sh4: Notice gUSA regions during signal delivery
linux-user/sh4: Clean env->flags on signal boundaries
target/sh4: Hoist register bank selection
target/sh4: Unify cpu_fregs into FREG
target/sh4: Pass DisasContext to fpr64 routines
target/sh4: Hoist fp register bank selection
target/sh4: Eliminate unused XREG macro
target/sh4: Merge DREG into fpr64 routines
target/sh4: Load/store Dr as 64-bit quantities
target/sh4: Simplify 64-bit fp reg-reg move
target/sh4: Unify code for CHECK_NOT_DELAY_SLOT
target/sh4: Unify code for CHECK_PRIVILEGED
target/sh4: Unify code for CHECK_FPU_ENABLED
target/sh4: Tidy misc illegal insn checks
target/sh4: Introduce CHECK_FPSCR_PR_*
target/sh4: Introduce CHECK_SH4A
target/sh4: Implement fpchg
target/sh4: Add missing FPSCR.PR == 0 checks
target/sh4: Implement fsrra
target/sh4: Use tcg_gen_lookup_and_goto_ptr
target/sh4/cpu.h | 27 +-
target/sh4/helper.h | 2 +
linux-user/signal.c | 26 ++
target/sh4/cpu.c | 2 +-
target/sh4/op_helper.c | 22 ++
target/sh4/translate.c | 946 ++++++++++++++++++++++++++++++++++++-------------
6 files changed, 775 insertions(+), 250 deletions(-)
--
2.9.4
* [Qemu-devel] [PATCH v2 01/27] target/sh4: Use cmpxchg for movco
From: Richard Henderson @ 2017-07-07 2:20 UTC
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
As for other targets, cmpxchg isn't quite right for ll/sc,
suffering from an ABA race, but is sufficient to implement
portable atomic operations.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/cpu.h | 3 ++-
target/sh4/translate.c | 56 +++++++++++++++++++++++++++++++++-----------------
2 files changed, 39 insertions(+), 20 deletions(-)
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index ffb9168..b15116e 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -169,7 +169,8 @@ typedef struct CPUSH4State {
tlb_t itlb[ITLB_SIZE]; /* instruction translation table */
tlb_t utlb[UTLB_SIZE]; /* unified translation table */
- uint32_t ldst;
+ uint32_t lock_addr;
+ uint32_t lock_value;
/* Fields up to this point are cleared by a CPU reset */
struct {} end_reset_fields;
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 4c3512f..82d4d69 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -68,7 +68,8 @@ static TCGv cpu_gregs[24];
static TCGv cpu_sr, cpu_sr_m, cpu_sr_q, cpu_sr_t;
static TCGv cpu_pc, cpu_ssr, cpu_spc, cpu_gbr;
static TCGv cpu_vbr, cpu_sgr, cpu_dbr, cpu_mach, cpu_macl;
-static TCGv cpu_pr, cpu_fpscr, cpu_fpul, cpu_ldst;
+static TCGv cpu_pr, cpu_fpscr, cpu_fpul;
+static TCGv cpu_lock_addr, cpu_lock_value;
static TCGv cpu_fregs[32];
/* internal register indexes */
@@ -151,8 +152,12 @@ void sh4_translate_init(void)
offsetof(CPUSH4State,
delayed_cond),
"_delayed_cond_");
- cpu_ldst = tcg_global_mem_new_i32(cpu_env,
- offsetof(CPUSH4State, ldst), "_ldst_");
+ cpu_lock_addr = tcg_global_mem_new_i32(cpu_env,
+ offsetof(CPUSH4State, lock_addr),
+ "_lock_addr_");
+ cpu_lock_value = tcg_global_mem_new_i32(cpu_env,
+ offsetof(CPUSH4State, lock_value),
+ "_lock_value_");
for (i = 0; i < 32; i++)
cpu_fregs[i] = tcg_global_mem_new_i32(cpu_env,
@@ -1528,20 +1533,32 @@ static void _decode_opc(DisasContext * ctx)
return;
case 0x0073:
/* MOVCO.L
- LDST -> T
+ LDST -> T
If (T == 1) R0 -> (Rn)
0 -> LDST
*/
if (ctx->features & SH_FEATURE_SH4A) {
- TCGLabel *label = gen_new_label();
- tcg_gen_mov_i32(cpu_sr_t, cpu_ldst);
- tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ldst, 0, label);
- tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx, MO_TEUL);
- gen_set_label(label);
- tcg_gen_movi_i32(cpu_ldst, 0);
- return;
- } else
- break;
+ TCGLabel *fail = gen_new_label();
+ TCGLabel *done = gen_new_label();
+ TCGv tmp;
+
+ tcg_gen_brcond_i32(TCG_COND_NE, REG(B11_8), cpu_lock_addr, fail);
+
+ tmp = tcg_temp_new();
+ tcg_gen_atomic_cmpxchg_i32(tmp, REG(B11_8), cpu_lock_value,
+ REG(0), ctx->memidx, MO_TEUL);
+ tcg_gen_setcond_i32(TCG_COND_EQ, cpu_sr_t, tmp, cpu_lock_value);
+ tcg_temp_free(tmp);
+ tcg_gen_br(done);
+
+ gen_set_label(fail);
+ tcg_gen_movi_i32(cpu_sr_t, 0);
+
+ gen_set_label(done);
+ return;
+ } else {
+ break;
+ }
case 0x0063:
/* MOVLI.L @Rm,R0
1 -> LDST
@@ -1549,13 +1566,14 @@ static void _decode_opc(DisasContext * ctx)
When interrupt/exception
occurred 0 -> LDST
*/
- if (ctx->features & SH_FEATURE_SH4A) {
- tcg_gen_movi_i32(cpu_ldst, 0);
+ if (ctx->features & SH_FEATURE_SH4A) {
tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx, MO_TESL);
- tcg_gen_movi_i32(cpu_ldst, 1);
- return;
- } else
- break;
+ tcg_gen_mov_i32(cpu_lock_addr, REG(B11_8));
+ tcg_gen_mov_i32(cpu_lock_value, REG(0));
+ return;
+ } else {
+ break;
+ }
case 0x0093: /* ocbi @Rn */
{
gen_helper_ocbi(cpu_env, REG(B11_8));
--
2.9.4
* [Qemu-devel] [PATCH v2 02/27] target/sh4: Consolidate end-of-TB tests
From: Richard Henderson @ 2017-07-07 2:20 UTC
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
We can fold 3 different tests within the decode loop
into a more accurate computation of max_insns to start.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
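Note (illustrative only, not part of the patch): the up-front bound on
max_insns can be sketched as a standalone function. EX_PAGE_SIZE and
bound_max_insns are hypothetical names, and the 4 KiB page size is an
assumption for the example; the patch itself uses TARGET_PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SIZE  4096u  /* assumed page size, for illustration */
#define EX_INSN_BYTES 2u     /* SH4 instructions are a fixed 16 bits */

/* Clamp max_insns so that the translation loop needs only one
   "num_insns < max_insns" test instead of separate page-end and
   single-step checks after every instruction. */
static int bound_max_insns(uint32_t pc, int max_insns, int singlestep)
{
    int on_page = (int)((EX_PAGE_SIZE - (pc & (EX_PAGE_SIZE - 1)))
                        / EX_INSN_BYTES);

    assert(max_insns > 0);
    if (max_insns > on_page) {
        max_insns = on_page;
    }
    if (singlestep) {
        max_insns = 1;
    }
    return max_insns;
}
```

With these assumptions, a pc four bytes before the end of a page yields a
bound of 2, and single-stepping always yields 1.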
target/sh4/translate.c | 29 +++++++++++++++++------------
1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 82d4d69..663b5c0 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -1848,7 +1848,6 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
ctx.features = env->features;
ctx.has_movcal = (ctx.tbflags & TB_FLAG_PENDING_MOVCA);
- num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
@@ -1856,9 +1855,23 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
if (max_insns > TCG_MAX_INSNS) {
max_insns = TCG_MAX_INSNS;
}
+ /* Since the ISA is fixed-width, we can bound by the number
+ of instructions remaining on the page. */
+ num_insns = (TARGET_PAGE_SIZE - (ctx.pc & (TARGET_PAGE_SIZE - 1))) / 2;
+ if (max_insns > num_insns) {
+ max_insns = num_insns;
+ }
+ /* Single stepping means just that. */
+ if (ctx.singlestep_enabled || singlestep) {
+ max_insns = 1;
+ }
gen_tb_start(tb);
- while (ctx.bstate == BS_NONE && !tcg_op_buf_full()) {
+ num_insns = 0;
+
+ while (ctx.bstate == BS_NONE
+ && num_insns < max_insns
+ && !tcg_op_buf_full()) {
tcg_gen_insn_start(ctx.pc, ctx.envflags);
num_insns++;
@@ -1882,18 +1895,10 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
ctx.opcode = cpu_lduw_code(env, ctx.pc);
decode_opc(&ctx);
ctx.pc += 2;
- if ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0)
- break;
- if (cs->singlestep_enabled) {
- break;
- }
- if (num_insns >= max_insns)
- break;
- if (singlestep)
- break;
}
- if (tb->cflags & CF_LAST_IO)
+ if (tb->cflags & CF_LAST_IO) {
gen_io_end();
+ }
if (cs->singlestep_enabled) {
gen_save_cpu_state(&ctx, true);
gen_helper_debug(cpu_env);
--
2.9.4
* [Qemu-devel] [PATCH v2 03/27] target/sh4: Introduce TB_FLAG_ENVFLAGS_MASK
From: Richard Henderson @ 2017-07-07 2:20 UTC
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
We'll be putting more things into this bitmask soon.
Let's have a name that covers all possible uses.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/cpu.h | 4 +++-
target/sh4/translate.c | 4 ++--
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index b15116e..240ed36 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -96,6 +96,8 @@
#define DELAY_SLOT_CONDITIONAL (1 << 1)
#define DELAY_SLOT_RTE (1 << 2)
+#define TB_FLAG_ENVFLAGS_MASK DELAY_SLOT_MASK
+
typedef struct tlb_t {
uint32_t vpn; /* virtual page number */
uint32_t ppn; /* physical page number */
@@ -389,7 +391,7 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
{
*pc = env->pc;
*cs_base = 0;
- *flags = (env->flags & DELAY_SLOT_MASK) /* Bits 0- 2 */
+ *flags = (env->flags & TB_FLAG_ENVFLAGS_MASK) /* Bits 0-2 */
| (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
| (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
| (env->sr & (1u << SR_FD)) /* Bit 15 */
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 663b5c0..cf53cd6 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -225,7 +225,7 @@ static inline void gen_save_cpu_state(DisasContext *ctx, bool save_pc)
if (ctx->delayed_pc != (uint32_t) -1) {
tcg_gen_movi_i32(cpu_delayed_pc, ctx->delayed_pc);
}
- if ((ctx->tbflags & DELAY_SLOT_MASK) != ctx->envflags) {
+ if ((ctx->tbflags & TB_FLAG_ENVFLAGS_MASK) != ctx->envflags) {
tcg_gen_movi_i32(cpu_flags, ctx->envflags);
}
}
@@ -1837,7 +1837,7 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
pc_start = tb->pc;
ctx.pc = pc_start;
ctx.tbflags = (uint32_t)tb->flags;
- ctx.envflags = tb->flags & DELAY_SLOT_MASK;
+ ctx.envflags = tb->flags & TB_FLAG_ENVFLAGS_MASK;
ctx.bstate = BS_NONE;
ctx.memidx = (ctx.tbflags & (1u << SR_MD)) == 0 ? 1 : 0;
/* We don't know if the delayed pc came from a dynamic or static branch,
--
2.9.4
* [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean
From: Richard Henderson @ 2017-07-07 2:20 UTC
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
If we mask off any out-of-band bits before we assign to the
variable, then we don't need to clean it up when reading.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/cpu.h | 2 +-
target/sh4/cpu.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index 240ed36..6d179a7 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -391,7 +391,7 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
{
*pc = env->pc;
*cs_base = 0;
- *flags = (env->flags & TB_FLAG_ENVFLAGS_MASK) /* Bits 0-2 */
+ *flags = env->flags /* Bits 0-2 */
| (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
| (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
| (env->sr & (1u << SR_FD)) /* Bit 15 */
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
index 9da7e1e..8536f6d 100644
--- a/target/sh4/cpu.c
+++ b/target/sh4/cpu.c
@@ -39,7 +39,7 @@ static void superh_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
SuperHCPU *cpu = SUPERH_CPU(cs);
cpu->env.pc = tb->pc;
- cpu->env.flags = tb->flags;
+ cpu->env.flags = tb->flags & TB_FLAG_ENVFLAGS_MASK;
}
static bool superh_cpu_has_work(CPUState *cs)
--
2.9.4
* [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA
From: Richard Henderson @ 2017-07-07 2:20 UTC
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Don't leave an unused bit after DELAY_SLOT_MASK.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/cpu.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index 6d179a7..da31805 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -96,6 +96,8 @@
#define DELAY_SLOT_CONDITIONAL (1 << 1)
#define DELAY_SLOT_RTE (1 << 2)
+#define TB_FLAG_PENDING_MOVCA (1 << 3)
+
#define TB_FLAG_ENVFLAGS_MASK DELAY_SLOT_MASK
typedef struct tlb_t {
@@ -369,8 +371,6 @@ static inline int cpu_ptel_pr (uint32_t ptel)
#define PTEA_TC (1 << 3)
#define cpu_ptea_tc(ptea) (((ptea) & PTEA_TC) >> 3)
-#define TB_FLAG_PENDING_MOVCA (1 << 4)
-
static inline target_ulong cpu_read_sr(CPUSH4State *env)
{
return env->sr | (env->sr_m << SR_M) |
@@ -395,7 +395,7 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
| (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
| (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
| (env->sr & (1u << SR_FD)) /* Bit 15 */
- | (env->movcal_backup ? TB_FLAG_PENDING_MOVCA : 0); /* Bit 4 */
+ | (env->movcal_backup ? TB_FLAG_PENDING_MOVCA : 0); /* Bit 3 */
}
#endif /* SH4_CPU_H */
--
2.9.4
* [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
From: Richard Henderson @ 2017-07-07 2:20 UTC
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
For uniprocessors, SH4 uses optimistic restartable atomic sequences.
Upon an interrupt, a real kernel would simply notice magic values in
the registers and reset the PC to the start of the sequence.
For QEMU, we cannot do this in quite the same way. Instead, we notice
the normal start of such a sequence (mov #-x,r15), and start a new TB
that can be executed under cpu_exec_step_atomic.
Reported-by: Bruno Haible <bruno@clisp.org>
LP: https://bugs.launchpad.net/bugs/1701971
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
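Note (illustrative only, not part of the patch): the way the negative
immediate from "mov #-x,r15" is packed into the flags and later checked
against the region end in R0 can be sketched as below. ex_deposit32,
ex_sextract32 and gusa_region_ok are hypothetical local stand-ins written for
this example; they are meant to mirror the deposit32/sextract32 usage in the
diff, not to reproduce QEMU's implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_GUSA_SHIFT 4  /* same value as GUSA_SHIFT in the patch */

/* Replace bits [start, start+length) of value with fieldval. */
static uint32_t ex_deposit32(uint32_t value, int start, int length,
                             uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length < 32 && start + length <= 32);
    mask = ((1u << length) - 1) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Extract bits [start, start+length) of value as a signed quantity. */
static int32_t ex_sextract32(uint32_t value, int start, int length)
{
    uint32_t field, sign;

    assert(start >= 0 && length > 0 && length < 32 && start + length <= 32);
    field = (value >> start) & ((1u << length) - 1);
    sign = 1u << (length - 1);
    return (int32_t)field - (int32_t)((field & sign) << 1);
}

/* A region is well formed when it starts exactly "backup" bytes before
   pc_end and contains at least two instructions. */
static bool gusa_region_ok(uint32_t pc, uint32_t pc_end, int backup)
{
    uint32_t max_insns = (pc_end - pc) / 2;

    return pc == pc_end + (uint32_t)backup && max_insns >= 2;
}
```

For example, an immediate of -10 lands in bits 4-11 as 0xf6 and reads back as
-10; a region ending at 0x40a must then start at 0x400.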
target/sh4/cpu.h | 18 +++++--
target/sh4/helper.h | 1 +
target/sh4/op_helper.c | 6 +++
target/sh4/translate.c | 131 +++++++++++++++++++++++++++++++++++++++++++++++--
4 files changed, 148 insertions(+), 8 deletions(-)
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index da31805..e3abb6a 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -98,7 +98,18 @@
#define TB_FLAG_PENDING_MOVCA (1 << 3)
-#define TB_FLAG_ENVFLAGS_MASK DELAY_SLOT_MASK
+#define GUSA_SHIFT 4
+#ifdef CONFIG_USER_ONLY
+#define GUSA_EXCLUSIVE (1 << 12)
+#define GUSA_MASK ((0xff << GUSA_SHIFT) | GUSA_EXCLUSIVE)
+#else
+/* Provide dummy versions of the above to allow tests against tbflags
+ to be elided while avoiding ifdefs. */
+#define GUSA_EXCLUSIVE 0
+#define GUSA_MASK 0
+#endif
+
+#define TB_FLAG_ENVFLAGS_MASK (DELAY_SLOT_MASK | GUSA_MASK)
typedef struct tlb_t {
uint32_t vpn; /* virtual page number */
@@ -390,8 +401,9 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
target_ulong *cs_base, uint32_t *flags)
{
*pc = env->pc;
- *cs_base = 0;
- *flags = env->flags /* Bits 0-2 */
+ /* For a gUSA region, notice the end of the region. */
+ *cs_base = env->flags & GUSA_MASK ? env->gregs[0] : 0;
+ *flags = env->flags /* TB_FLAG_ENVFLAGS_MASK: bits 0-2, 4-12 */
| (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
| (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
| (env->sr & (1u << SR_FD)) /* Bit 15 */
diff --git a/target/sh4/helper.h b/target/sh4/helper.h
index 767a6d5..6c6fa04 100644
--- a/target/sh4/helper.h
+++ b/target/sh4/helper.h
@@ -6,6 +6,7 @@ DEF_HELPER_1(raise_slot_fpu_disable, noreturn, env)
DEF_HELPER_1(debug, noreturn, env)
DEF_HELPER_1(sleep, noreturn, env)
DEF_HELPER_2(trapa, noreturn, env, i32)
+DEF_HELPER_1(exclusive, noreturn, env)
DEF_HELPER_3(movcal, void, env, i32, i32)
DEF_HELPER_1(discard_movcal_backup, void, env)
diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index c3d19b1..8513f38 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -115,6 +115,12 @@ void helper_trapa(CPUSH4State *env, uint32_t tra)
raise_exception(env, 0x160, 0);
}
+void helper_exclusive(CPUSH4State *env)
+{
+ /* We do not want cpu_restore_state to run. */
+ cpu_loop_exit_atomic(ENV_GET_CPU(env), 0);
+}
+
void helper_movcal(CPUSH4State *env, uint32_t address, uint32_t value)
{
if (cpu_sh4_is_cached (env, address))
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index cf53cd6..653c06c 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -235,7 +235,9 @@ static inline bool use_goto_tb(DisasContext *ctx, target_ulong dest)
if (unlikely(ctx->singlestep_enabled)) {
return false;
}
-
+ if (ctx->tbflags & GUSA_EXCLUSIVE) {
+ return false;
+ }
#ifndef CONFIG_USER_ONLY
return (ctx->tb->pc & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK);
#else
@@ -278,6 +280,28 @@ static void gen_conditional_jump(DisasContext * ctx,
target_ulong ift, target_ulong ifnott)
{
TCGLabel *l1 = gen_new_label();
+
+ if (ctx->tbflags & GUSA_EXCLUSIVE) {
+ /* When in an exclusive region, we must continue to the end.
+ Therefore, exit the region on a taken branch, but otherwise
+ fall through to the next instruction. */
+ uint32_t taken;
+ TCGCond cond;
+
+ if (ift == ctx->pc + 2) {
+ taken = ifnott;
+ cond = TCG_COND_NE;
+ } else {
+ taken = ift;
+ cond = TCG_COND_EQ;
+ }
+ tcg_gen_brcondi_i32(cond, cpu_sr_t, 0, l1);
+ tcg_gen_movi_i32(cpu_flags, ctx->envflags & ~GUSA_MASK);
+ gen_goto_tb(ctx, 0, taken);
+ gen_set_label(l1);
+ return;
+ }
+
gen_save_cpu_state(ctx, false);
tcg_gen_brcondi_i32(TCG_COND_NE, cpu_sr_t, 0, l1);
gen_goto_tb(ctx, 0, ifnott);
@@ -289,13 +313,26 @@ static void gen_conditional_jump(DisasContext * ctx,
/* Delayed conditional jump (bt or bf) */
static void gen_delayed_conditional_jump(DisasContext * ctx)
{
- TCGLabel *l1;
- TCGv ds;
+ TCGLabel *l1 = gen_new_label();
+ TCGv ds = tcg_temp_new();
- l1 = gen_new_label();
- ds = tcg_temp_new();
tcg_gen_mov_i32(ds, cpu_delayed_cond);
tcg_gen_discard_i32(cpu_delayed_cond);
+
+ if (ctx->tbflags & GUSA_EXCLUSIVE) {
+ /* When in an exclusive region, we must continue to the end.
+ Therefore, exit the region on a taken branch, but otherwise
+ fall through to the next instruction. */
+ tcg_gen_brcondi_i32(TCG_COND_EQ, ds, 0, l1);
+
+ /* Leave the gUSA region. */
+ tcg_gen_movi_i32(cpu_flags, ctx->envflags & ~GUSA_MASK);
+ gen_jump(ctx);
+
+ gen_set_label(l1);
+ return;
+ }
+
tcg_gen_brcondi_i32(TCG_COND_NE, ds, 0, l1);
gen_goto_tb(ctx, 1, ctx->pc + 2);
gen_set_label(l1);
@@ -480,6 +517,15 @@ static void _decode_opc(DisasContext * ctx)
}
return;
case 0xe000: /* mov #imm,Rn */
+#ifdef CONFIG_USER_ONLY
+ /* Detect the start of a gUSA region. If so, update envflags
+ and end the TB. This will allow us to see the end of the
+ region (stored in R0) in the next TB. */
+ if (B11_8 == 15 && B7_0s < 0) {
+ ctx->envflags = deposit32(ctx->envflags, GUSA_SHIFT, 8, B7_0s);
+ ctx->bstate = BS_STOP;
+ }
+#endif
tcg_gen_movi_i32(REG(B11_8), B7_0s);
return;
case 0x9000: /* mov.w @(disp,PC),Rn */
@@ -1814,6 +1860,18 @@ static void decode_opc(DisasContext * ctx)
if (old_flags & DELAY_SLOT_MASK) {
/* go out of the delay slot */
ctx->envflags &= ~DELAY_SLOT_MASK;
+
+ /* When in an exclusive region, we must continue to the end
+ for conditional branches. */
+ if (ctx->tbflags & GUSA_EXCLUSIVE
+ && old_flags & DELAY_SLOT_CONDITIONAL) {
+ gen_delayed_conditional_jump(ctx);
+ return;
+ }
+ /* Otherwise this is probably an invalid gUSA region.
+ Drop the GUSA bits so the next TB doesn't see them. */
+ ctx->envflags &= ~GUSA_MASK;
+
tcg_gen_movi_i32(cpu_flags, ctx->envflags);
ctx->bstate = BS_BRANCH;
if (old_flags & DELAY_SLOT_CONDITIONAL) {
@@ -1821,9 +1879,60 @@ static void decode_opc(DisasContext * ctx)
} else {
gen_jump(ctx);
}
+ }
+}
+#ifdef CONFIG_USER_ONLY
+/* For uniprocessors, SH4 uses optimistic restartable atomic sequences.
+ Upon an interrupt, a real kernel would simply notice magic values in
+ the registers and reset the PC to the start of the sequence.
+
+ For QEMU, we cannot do this in quite the same way. Instead, we notice
+ the normal start of such a sequence (mov #-x,r15). While we can handle
+ any sequence via cpu_exec_step_atomic, we can recognize the "normal"
+ sequences and transform them into atomic operations as seen by the host.
+*/
+static int decode_gusa(DisasContext *ctx, CPUSH4State *env, int *pmax_insns)
+{
+ uint32_t pc = ctx->pc;
+ uint32_t pc_end = ctx->tb->cs_base;
+ int backup = sextract32(ctx->tbflags, GUSA_SHIFT, 8);
+ int max_insns = (pc_end - pc) / 2;
+
+ if (pc != pc_end + backup || max_insns < 2) {
+ /* This is a malformed gUSA region. Don't do anything special,
+ since the interpreter is likely to get confused. */
+ ctx->envflags &= ~GUSA_MASK;
+ return 0;
+ }
+
+ if (ctx->tbflags & GUSA_EXCLUSIVE) {
+ /* Regardless of single-stepping or the end of the page,
+ we must complete execution of the gUSA region while
+ holding the exclusive lock. */
+ *pmax_insns = max_insns;
+ return 0;
}
+
+ qemu_log_mask(LOG_UNIMP, "Unrecognized gUSA sequence %08x-%08x\n",
+ pc, pc_end);
+
+ /* Restart with the EXCLUSIVE bit set, within a TB run via
+ cpu_exec_step_atomic holding the exclusive lock. */
+ tcg_gen_insn_start(pc, ctx->envflags);
+ ctx->envflags |= GUSA_EXCLUSIVE;
+ gen_save_cpu_state(ctx, false);
+ gen_helper_exclusive(cpu_env);
+ ctx->bstate = BS_EXCP;
+
+ /* We're not executing an instruction, but we must report one for the
+ purposes of accounting within the TB. We might as well report the
+ entire region consumed via ctx->pc so that it's immediately available
+ in the disassembly dump. */
+ ctx->pc = pc_end;
+ return 1;
}
+#endif
void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
{
@@ -1869,6 +1978,12 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
gen_tb_start(tb);
num_insns = 0;
+#ifdef CONFIG_USER_ONLY
+ if (ctx.tbflags & GUSA_MASK) {
+ num_insns = decode_gusa(&ctx, env, &max_insns);
+ }
+#endif
+
while (ctx.bstate == BS_NONE
&& num_insns < max_insns
&& !tcg_op_buf_full()) {
@@ -1899,6 +2014,12 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
if (tb->cflags & CF_LAST_IO) {
gen_io_end();
}
+
+ if (ctx.tbflags & GUSA_EXCLUSIVE) {
+ /* Ending the region of exclusivity. Clear the bits. */
+ ctx.envflags &= ~GUSA_MASK;
+ }
+
if (cs->singlestep_enabled) {
gen_save_cpu_state(&ctx, true);
gen_helper_debug(cpu_env);
--
2.9.4
* [Qemu-devel] [PATCH v2 07/27] target/sh4: Recognize common gUSA sequences
From: Richard Henderson @ 2017-07-07 2:20 UTC
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
For many of the sequences produced by gcc or glibc,
we can translate these as host atomic operations,
which avoids the need to acquire the exclusive lock.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 316 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 316 insertions(+)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 653c06c..73b3e02 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -1894,10 +1894,17 @@ static void decode_opc(DisasContext * ctx)
*/
static int decode_gusa(DisasContext *ctx, CPUSH4State *env, int *pmax_insns)
{
+ uint16_t insns[5];
+ int ld_adr, ld_dst, ld_mop;
+ int op_dst, op_src, op_opc;
+ int mv_src, mt_dst, st_src, st_mop;
+ TCGv op_arg;
+
uint32_t pc = ctx->pc;
uint32_t pc_end = ctx->tb->cs_base;
int backup = sextract32(ctx->tbflags, GUSA_SHIFT, 8);
int max_insns = (pc_end - pc) / 2;
+ int i;
if (pc != pc_end + backup || max_insns < 2) {
/* This is a malformed gUSA region. Don't do anything special,
@@ -1914,6 +1921,315 @@ static int decode_gusa(DisasContext *ctx, CPUSH4State *env, int *pmax_insns)
return 0;
}
+ /* The state machine below will consume only a few insns.
+ If there are more than that in a region, fail now. */
+ if (max_insns > ARRAY_SIZE(insns)) {
+ goto fail;
+ }
+
+ /* Read all of the insns for the region. */
+ for (i = 0; i < max_insns; ++i) {
+ insns[i] = cpu_lduw_code(env, pc + i * 2);
+ }
+
+ ld_adr = ld_dst = ld_mop = -1;
+ mv_src = -1;
+ op_dst = op_src = op_opc = -1;
+ mt_dst = -1;
+ st_src = st_mop = -1;
+ TCGV_UNUSED(op_arg);
+ i = 0;
+
+#define NEXT_INSN \
+ do { if (i >= max_insns) goto fail; ctx->opcode = insns[i++]; } while (0)
+
+ /*
+ * Expect a load to begin the region.
+ */
+ NEXT_INSN;
+ switch (ctx->opcode & 0xf00f) {
+ case 0x6000: /* mov.b @Rm,Rn */
+ ld_mop = MO_SB;
+ break;
+ case 0x6001: /* mov.w @Rm,Rn */
+ ld_mop = MO_TESW;
+ break;
+ case 0x6002: /* mov.l @Rm,Rn */
+ ld_mop = MO_TESL;
+ break;
+ default:
+ goto fail;
+ }
+ ld_adr = B7_4;
+ ld_dst = B11_8;
+ if (ld_adr == ld_dst) {
+ goto fail;
+ }
+ /* Unless we see a mov, any two-operand operation must use ld_dst. */
+ op_dst = ld_dst;
+
+ /*
+ * Expect an optional register move.
+ */
+ NEXT_INSN;
+ switch (ctx->opcode & 0xf00f) {
+ case 0x6003: /* mov Rm,Rn */
+ /* Here we want to recognize ld_dst being saved for later consumption,
+ or for another input register being copied so that ld_dst need not
+ be clobbered during the operation. */
+ op_dst = B11_8;
+ mv_src = B7_4;
+ if (op_dst == ld_dst) {
+ /* Overwriting the load output. */
+ goto fail;
+ }
+ if (mv_src != ld_dst) {
+ /* Copying a new input; constrain op_src to match the load. */
+ op_src = ld_dst;
+ }
+ break;
+
+ default:
+ /* Put back and re-examine as operation. */
+ --i;
+ }
+
+ /*
+ * Expect the operation.
+ */
+ NEXT_INSN;
+ switch (ctx->opcode & 0xf00f) {
+ case 0x300c: /* add Rm,Rn */
+ op_opc = INDEX_op_add_i32;
+ goto do_reg_op;
+ case 0x2009: /* and Rm,Rn */
+ op_opc = INDEX_op_and_i32;
+ goto do_reg_op;
+ case 0x200a: /* xor Rm,Rn */
+ op_opc = INDEX_op_xor_i32;
+ goto do_reg_op;
+ case 0x200b: /* or Rm,Rn */
+ op_opc = INDEX_op_or_i32;
+ do_reg_op:
+ /* The operation register should be as expected, and the
+ other input cannot depend on the load. */
+ if (op_dst != B11_8) {
+ goto fail;
+ }
+ if (op_src < 0) {
+ /* Unconstrained input. */
+ op_src = B7_4;
+ } else if (op_src == B7_4) {
+ /* Constrained input matched load. All operations are
+ commutative; "swap" them by "moving" the load output
+ to the (implicit) first argument and the move source
+ to the (explicit) second argument. */
+ op_src = mv_src;
+ } else {
+ goto fail;
+ }
+ op_arg = REG(op_src);
+ break;
+
+ case 0x6007: /* not Rm,Rn */
+ if (ld_dst != B7_4 || mv_src >= 0) {
+ goto fail;
+ }
+ op_dst = B11_8;
+ op_opc = INDEX_op_xor_i32;
+ op_arg = tcg_const_i32(-1);
+ break;
+
+ case 0x7000 ... 0x700f: /* add #imm,Rn */
+ if (op_dst != B11_8 || op_src >= 0) {
+ goto fail;
+ }
+ op_opc = INDEX_op_add_i32;
+ op_arg = tcg_const_i32(B7_0s);
+ break;
+
+ case 0x3000: /* cmp/eq Rm,Rn */
+ /* Looking for the middle of a compare-and-swap sequence,
+ beginning with the compare. Operands can be either order,
+ but with only one overlapping the load. */
+ if ((ld_dst == B11_8) + (ld_dst == B7_4) != 1 || mv_src >= 0) {
+ goto fail;
+ }
+ op_opc = INDEX_op_setcond_i32; /* placeholder */
+ op_src = (ld_dst == B11_8 ? B7_4 : B11_8);
+ op_arg = REG(op_src);
+
+ NEXT_INSN;
+ switch (ctx->opcode & 0xff00) {
+ case 0x8b00: /* bf label */
+ case 0x8f00: /* bf/s label */
+ if (pc + (i + 1 + B7_0s) * 2 != pc_end) {
+ goto fail;
+ }
+ if ((ctx->opcode & 0xff00) == 0x8b00) { /* bf label */
+ break;
+ }
+ /* We're looking to unconditionally modify Rn with the
+ result of the comparison, within the delay slot of
+ the branch. This is used by older gcc. */
+ NEXT_INSN;
+ if ((ctx->opcode & 0xf0ff) == 0x0029) { /* movt Rn */
+ mt_dst = B11_8;
+ } else {
+ goto fail;
+ }
+ break;
+
+ default:
+ goto fail;
+ }
+ break;
+
+ case 0x2008: /* tst Rm,Rn */
+ /* Looking for a compare-and-swap against zero. */
+ if (ld_dst != B11_8 || ld_dst != B7_4 || mv_src >= 0) {
+ goto fail;
+ }
+ op_opc = INDEX_op_setcond_i32;
+ op_arg = tcg_const_i32(0);
+
+ NEXT_INSN;
+ if ((ctx->opcode & 0xff00) != 0x8900 /* bt label */
+ || pc + (i + 1 + B7_0s) * 2 != pc_end) {
+ goto fail;
+ }
+ break;
+
+ default:
+ /* Put back and re-examine as store. */
+ --i;
+ }
+
+ /*
+ * Expect the store.
+ */
+ /* The store must be the last insn. */
+ if (i != max_insns - 1) {
+ goto fail;
+ }
+ NEXT_INSN;
+ switch (ctx->opcode & 0xf00f) {
+ case 0x2000: /* mov.b Rm,@Rn */
+ st_mop = MO_UB;
+ break;
+ case 0x2001: /* mov.w Rm,@Rn */
+ st_mop = MO_UW;
+ break;
+ case 0x2002: /* mov.l Rm,@Rn */
+ st_mop = MO_UL;
+ break;
+ default:
+ goto fail;
+ }
+ /* The store must match the load. */
+ if (ld_adr != B11_8 || st_mop != (ld_mop & MO_SIZE)) {
+ goto fail;
+ }
+ st_src = B7_4;
+
+#undef NEXT_INSN
+
+ /*
+ * Emit the operation.
+ */
+ tcg_gen_insn_start(pc, ctx->envflags);
+ switch (op_opc) {
+ case -1:
+ /* No operation found. Look for exchange pattern. */
+ if (st_src == ld_dst || mv_src >= 0) {
+ goto fail;
+ }
+ tcg_gen_atomic_xchg_i32(REG(ld_dst), REG(ld_adr), REG(st_src),
+ ctx->memidx, ld_mop);
+ break;
+
+ case INDEX_op_add_i32:
+ if (op_dst != st_src) {
+ goto fail;
+ }
+ if (op_dst == ld_dst && st_mop == MO_UL) {
+ tcg_gen_atomic_add_fetch_i32(REG(ld_dst), REG(ld_adr),
+ op_arg, ctx->memidx, ld_mop);
+ } else {
+ tcg_gen_atomic_fetch_add_i32(REG(ld_dst), REG(ld_adr),
+ op_arg, ctx->memidx, ld_mop);
+ if (op_dst != ld_dst) {
+ /* Note that mop sizes < 4 cannot use add_fetch
+ because it won't carry into the higher bits. */
+ tcg_gen_add_i32(REG(op_dst), REG(ld_dst), op_arg);
+ }
+ }
+ break;
+
+ case INDEX_op_and_i32:
+ if (op_dst != st_src) {
+ goto fail;
+ }
+ if (op_dst == ld_dst) {
+ tcg_gen_atomic_and_fetch_i32(REG(ld_dst), REG(ld_adr),
+ op_arg, ctx->memidx, ld_mop);
+ } else {
+ tcg_gen_atomic_fetch_and_i32(REG(ld_dst), REG(ld_adr),
+ op_arg, ctx->memidx, ld_mop);
+ tcg_gen_and_i32(REG(op_dst), REG(ld_dst), op_arg);
+ }
+ break;
+
+ case INDEX_op_or_i32:
+ if (op_dst != st_src) {
+ goto fail;
+ }
+ if (op_dst == ld_dst) {
+ tcg_gen_atomic_or_fetch_i32(REG(ld_dst), REG(ld_adr),
+ op_arg, ctx->memidx, ld_mop);
+ } else {
+ tcg_gen_atomic_fetch_or_i32(REG(ld_dst), REG(ld_adr),
+ op_arg, ctx->memidx, ld_mop);
+ tcg_gen_or_i32(REG(op_dst), REG(ld_dst), op_arg);
+ }
+ break;
+
+ case INDEX_op_xor_i32:
+ if (op_dst != st_src) {
+ goto fail;
+ }
+ if (op_dst == ld_dst) {
+ tcg_gen_atomic_xor_fetch_i32(REG(ld_dst), REG(ld_adr),
+ op_arg, ctx->memidx, ld_mop);
+ } else {
+ tcg_gen_atomic_fetch_xor_i32(REG(ld_dst), REG(ld_adr),
+ op_arg, ctx->memidx, ld_mop);
+ tcg_gen_xor_i32(REG(op_dst), REG(ld_dst), op_arg);
+ }
+ break;
+
+ case INDEX_op_setcond_i32:
+ if (st_src == ld_dst) {
+ goto fail;
+ }
+ tcg_gen_atomic_cmpxchg_i32(REG(ld_dst), REG(ld_adr), op_arg,
+ REG(st_src), ctx->memidx, ld_mop);
+ tcg_gen_setcond_i32(TCG_COND_EQ, cpu_sr_t, REG(ld_dst), op_arg);
+ if (mt_dst >= 0) {
+ tcg_gen_mov_i32(REG(mt_dst), cpu_sr_t);
+ }
+ break;
+
+ default:
+ g_assert_not_reached();
+ }
+
+ /* The entire region has been translated. */
+ ctx->envflags &= ~GUSA_MASK;
+ ctx->pc = pc_end;
+ return max_insns;
+
+ fail:
qemu_log_mask(LOG_UNIMP, "Unrecognized gUSA sequence %08x-%08x\n",
pc, pc_end);
--
2.9.4
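
A note on the add case in the patch above: the comment that memory
operations narrower than 4 bytes cannot use add_fetch can be checked
with a small host-side model. This is only a sketch and not QEMU code.
The helper names are made up for illustration, and it assumes the byte
load sign-extends into a 32-bit register, as mov.b does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not QEMU code.  Sign-extend the low 8 bits. */
static int32_t sext8(uint32_t v)
{
    return (int32_t)((v & 0xffu) ^ 0x80u) - 0x80;
}

/* Value the guest expects in the register after
   "mov.b @Rm,Rx; add #imm,Rx": a full 32-bit add. */
static int32_t guest_reg_after_add(uint8_t mem, int32_t imm)
{
    return sext8(mem) + imm;
}

/* Value a byte-sized add_fetch would hand back: the sum is
   truncated to 8 bits before being extended again. */
static int32_t byte_add_fetch(uint8_t mem, int32_t imm)
{
    return sext8((uint32_t)mem + (uint32_t)imm);
}
```

With mem = 0x7f and imm = 1 the two helpers disagree, because the carry
out of bit 7 is lost in the byte-sized result. For small values that do
not cross the byte boundary they agree.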
* [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (6 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 07/27] target/sh4: Recognize common gUSA sequences Richard Henderson
@ 2017-07-07 2:20 ` Richard Henderson
2017-07-07 7:25 ` John Paul Adrian Glaubitz
` (2 more replies)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries Richard Henderson
` (19 subsequent siblings)
27 siblings, 3 replies; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:20 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
We translate gUSA regions atomically in a parallel context.
But in a serial context a gUSA region may be interrupted.
In that case, restart the region as the kernel would.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
linux-user/signal.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 3d18d1b..a537778 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -3471,6 +3471,25 @@ static abi_ulong get_sigframe(struct target_sigaction *ka,
return (sp - frame_size) & -8ul;
}
+/* Notice when we're in the middle of a gUSA region and reset.
+ Note that this will only occur for !parallel_cpus, as we will
+ translate such sequences differently in a parallel context. */
+static void unwind_gusa(CPUSH4State *regs)
+{
+ /* If the stack pointer is sufficiently negative ... */
+ if ((regs->gregs[15] & 0xc0000000u) == 0xc0000000u
+ /* ... and we haven't completed the sequence ... */
+ && regs->pc < regs->gregs[0]) {
+ /* Reset the PC to before the gUSA region, as computed from
+ R0 = region end, SP = -(region size), plus one more insn
+ that actually sets SP to -(region size). */
+ regs->pc = regs->gregs[0] + regs->gregs[15] - 2;
+
+ /* Reset the SP to the saved version in R1. */
+ regs->gregs[15] = regs->gregs[1];
+ }
+}
+
static void setup_sigcontext(struct target_sigcontext *sc,
CPUSH4State *regs, unsigned long mask)
{
@@ -3534,6 +3553,8 @@ static void setup_frame(int sig, struct target_sigaction *ka,
abi_ulong frame_addr;
int i;
+ unwind_gusa(regs);
+
frame_addr = get_sigframe(ka, regs->gregs[15], sizeof(*frame));
trace_user_setup_frame(regs, frame_addr);
if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0)) {
@@ -3583,6 +3604,8 @@ static void setup_rt_frame(int sig, struct target_sigaction *ka,
abi_ulong frame_addr;
int i;
+ unwind_gusa(regs);
+
frame_addr = get_sigframe(ka, regs->gregs[15], sizeof(*frame));
trace_user_setup_rt_frame(regs, frame_addr);
if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0)) {
--
2.9.4
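
The PC arithmetic in unwind_gusa() above can be tried outside QEMU. The
following is a minimal sketch using a made-up struct in place of
CPUSH4State; the field names are assumptions for illustration, while the
condition and the two assignments mirror the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few CPUSH4State fields involved. */
struct gusa_regs {
    uint32_t r0;   /* address of the end of the gUSA region */
    uint32_t r1;   /* saved stack pointer */
    uint32_t sp;   /* R15: -(region size) while inside the region */
    uint32_t pc;
};

/* Returns true if the registers were rewound to the region start. */
static bool gusa_unwind(struct gusa_regs *r)
{
    if ((r->sp & 0xc0000000u) == 0xc0000000u && r->pc < r->r0) {
        /* End + (negative size) lands on the first insn of the
           region; 2 more bytes back is the insn that loads SP. */
        r->pc = r->r0 + r->sp - 2;
        r->sp = r->r1;
        return true;
    }
    return false;
}
```

For a 6-byte region ending at 0x1010, SP holds 0xfffffffa, and an
interruption at 0x100c rewinds the PC to 0x1008 with SP restored from
R1. The unsigned wraparound in the addition is intentional.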
* [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (7 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery Richard Henderson
@ 2017-07-07 2:20 ` Richard Henderson
2017-07-15 22:59 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 10/27] target/sh4: Hoist register bank selection Richard Henderson
` (18 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:20 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
If a signal is delivered during the execution of a delay slot,
or a gUSA region, clear those bits from the environment so that
the signal handler does not start in that same state.
Clearing the same bits on signal return as well is a sensible precaution.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
linux-user/signal.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/linux-user/signal.c b/linux-user/signal.c
index a537778..8c0b851 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -3544,6 +3544,7 @@ static void restore_sigcontext(CPUSH4State *regs, struct target_sigcontext *sc)
__get_user(regs->fpul, &sc->sc_fpul);
regs->tra = -1; /* disable syscall checks */
+ regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
}
static void setup_frame(int sig, struct target_sigaction *ka,
@@ -3587,6 +3588,7 @@ static void setup_frame(int sig, struct target_sigaction *ka,
regs->gregs[5] = 0;
regs->gregs[6] = frame_addr += offsetof(typeof(*frame), sc);
regs->pc = (unsigned long) ka->_sa_handler;
+ regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
unlock_user_struct(frame, frame_addr, 1);
return;
@@ -3649,6 +3651,7 @@ static void setup_rt_frame(int sig, struct target_sigaction *ka,
regs->gregs[5] = frame_addr + offsetof(typeof(*frame), info);
regs->gregs[6] = frame_addr + offsetof(typeof(*frame), uc);
regs->pc = (unsigned long) ka->_sa_handler;
+ regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
unlock_user_struct(frame, frame_addr, 1);
return;
--
2.9.4
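
The effect of the three added lines in the patch above is a plain mask
operation. A minimal sketch follows. The EX_* bit assignments below are
placeholders chosen for illustration only; they are not the real values
of DELAY_SLOT_MASK and GUSA_MASK, which come from the QEMU sources.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder masks, NOT the real QEMU definitions. */
#define EX_DELAY_SLOT_MASK 0x00000007u
#define EX_GUSA_MASK       0x00000ff0u

/* Drop delay-slot and gUSA state so the handler starts clean;
   every other flag bit is preserved. */
static uint32_t clean_flags_for_signal(uint32_t flags)
{
    return flags & ~(EX_DELAY_SLOT_MASK | EX_GUSA_MASK);
}
```

Bits outside the two masks pass through unchanged, which is the point
of masking rather than assigning zero.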
* [Qemu-devel] [PATCH v2 10/27] target/sh4: Hoist register bank selection
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (8 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries Richard Henderson
@ 2017-07-07 2:20 ` Richard Henderson
2017-07-07 21:48 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 11/27] target/sh4: Unify cpu_fregs into FREG Richard Henderson
` (17 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:20 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Compute which register bank to use once at the start of translation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 73b3e02..0ac101e 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -41,6 +41,7 @@ typedef struct DisasContext {
uint32_t envflags; /* should stay in sync with env->flags using TCG ops */
int bstate;
int memidx;
+ int gbank;
uint32_t delayed_pc;
int singlestep_enabled;
uint32_t features;
@@ -64,7 +65,7 @@ enum {
/* global register indexes */
static TCGv_env cpu_env;
-static TCGv cpu_gregs[24];
+static TCGv cpu_gregs[32];
static TCGv cpu_sr, cpu_sr_m, cpu_sr_q, cpu_sr_t;
static TCGv cpu_pc, cpu_ssr, cpu_spc, cpu_gbr;
static TCGv cpu_vbr, cpu_sgr, cpu_dbr, cpu_mach, cpu_macl;
@@ -99,16 +100,19 @@ void sh4_translate_init(void)
"FPR12_BANK1", "FPR13_BANK1", "FPR14_BANK1", "FPR15_BANK1",
};
- if (done_init)
+ if (done_init) {
return;
+ }
cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
tcg_ctx.tcg_env = cpu_env;
- for (i = 0; i < 24; i++)
+ for (i = 0; i < 24; i++) {
cpu_gregs[i] = tcg_global_mem_new_i32(cpu_env,
offsetof(CPUSH4State, gregs[i]),
gregnames[i]);
+ }
+ memcpy(cpu_gregs + 24, cpu_gregs + 8, 8 * sizeof(TCGv));
cpu_pc = tcg_global_mem_new_i32(cpu_env,
offsetof(CPUSH4State, pc), "PC");
@@ -359,13 +363,8 @@ static inline void gen_store_fpr64 (TCGv_i64 t, int reg)
#define B11_8 ((ctx->opcode >> 8) & 0xf)
#define B15_12 ((ctx->opcode >> 12) & 0xf)
-#define REG(x) ((x) < 8 && (ctx->tbflags & (1u << SR_MD))\
- && (ctx->tbflags & (1u << SR_RB))\
- ? (cpu_gregs[x + 16]) : (cpu_gregs[x]))
-
-#define ALTREG(x) ((x) < 8 && (!(ctx->tbflags & (1u << SR_MD))\
- || !(ctx->tbflags & (1u << SR_RB)))\
- ? (cpu_gregs[x + 16]) : (cpu_gregs[x]))
+#define REG(x) cpu_gregs[(x) ^ ctx->gbank]
+#define ALTREG(x) cpu_gregs[(x) ^ ctx->gbank ^ 0x10]
#define FREG(x) (ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x))
#define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
@@ -2272,6 +2271,8 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
ctx.singlestep_enabled = cs->singlestep_enabled;
ctx.features = env->features;
ctx.has_movcal = (ctx.tbflags & TB_FLAG_PENDING_MOVCA);
+ ctx.gbank = ((ctx.tbflags & (1 << SR_MD)) &&
+ (ctx.tbflags & (1 << SR_RB))) * 0x10;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
--
2.9.4
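
That the XOR form of REG/ALTREG in the patch above selects the same
registers as the old conditional macros can be checked exhaustively. The
sketch below models cpu_gregs[] as a table of backing indices, with
slots 24..31 aliasing slots 8..15 like the memcpy in the patch. The
function names are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of cpu_gregs[]: each slot holds the index of the backing
   register.  Slots 24..31 alias slots 8..15. */
static void init_slots(int slots[32])
{
    for (int i = 0; i < 24; i++) {
        slots[i] = i;
    }
    for (int i = 0; i < 8; i++) {
        slots[24 + i] = slots[8 + i];
    }
}

/* The conditional forms removed by the patch. */
static int reg_old(int x, bool md, bool rb)
{
    return (x < 8 && md && rb) ? x + 16 : x;
}

static int altreg_old(int x, bool md, bool rb)
{
    return (x < 8 && (!md || !rb)) ? x + 16 : x;
}

/* True if the XOR forms agree with the old forms for all 16
   register numbers and all four MD/RB combinations. */
static bool banking_matches(void)
{
    int slots[32];

    init_slots(slots);
    for (int md = 0; md < 2; md++) {
        for (int rb = 0; rb < 2; rb++) {
            int gbank = (md && rb) * 0x10;
            for (int x = 0; x < 16; x++) {
                if (slots[x ^ gbank] != reg_old(x, md, rb) ||
                    slots[x ^ gbank ^ 0x10] != altreg_old(x, md, rb)) {
                    return false;
                }
            }
        }
    }
    return true;
}
```

The aliasing of slots 24..31 is what lets R8..R15 ignore the bank bit:
XOR with 0x10 moves them to 24..31, which point back at the same
registers.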
* [Qemu-devel] [PATCH v2 11/27] target/sh4: Unify cpu_fregs into FREG
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (9 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 10/27] target/sh4: Hoist register bank selection Richard Henderson
@ 2017-07-07 2:20 ` Richard Henderson
2017-07-07 21:54 ` Aurelien Jarno
2017-07-08 16:54 ` Philippe Mathieu-Daudé
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 12/27] target/sh4: Pass DisasContext to fpr64 routines Richard Henderson
` (16 subsequent siblings)
27 siblings, 2 replies; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:20 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
We were treating FREG as an index and REG as a TCGv.
Making FREG return a TCGv is both less confusing and
a step toward cleaner banking of cpu_fregs.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 125 ++++++++++++++++++++-----------------------------
1 file changed, 52 insertions(+), 73 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 0ac101e..b521cff 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -366,10 +366,11 @@ static inline void gen_store_fpr64 (TCGv_i64 t, int reg)
#define REG(x) cpu_gregs[(x) ^ ctx->gbank]
#define ALTREG(x) cpu_gregs[(x) ^ ctx->gbank ^ 0x10]
-#define FREG(x) (ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x))
+#define FREG(x) cpu_fregs[ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x)]
#define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
-#define XREG(x) (ctx->tbflags & FPSCR_FR ? XHACK(x) ^ 0x10 : XHACK(x))
-#define DREG(x) FREG(x) /* Assumes lsb of (x) is always 0 */
+#define XREG(x) FREG(XHACK(x))
+/* Assumes lsb of (x) is always 0 */
+#define DREG(x) (ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x))
#define CHECK_NOT_DELAY_SLOT \
if (ctx->envflags & DELAY_SLOT_MASK) { \
@@ -989,56 +990,51 @@ static void _decode_opc(DisasContext * ctx)
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
TCGv_i64 fp = tcg_temp_new_i64();
- gen_load_fpr64(fp, XREG(B7_4));
- gen_store_fpr64(fp, XREG(B11_8));
+ gen_load_fpr64(fp, XHACK(B7_4));
+ gen_store_fpr64(fp, XHACK(B11_8));
tcg_temp_free_i64(fp);
} else {
- tcg_gen_mov_i32(cpu_fregs[FREG(B11_8)], cpu_fregs[FREG(B7_4)]);
+ tcg_gen_mov_i32(FREG(B11_8), FREG(B7_4));
}
return;
case 0xf00a: /* fmov {F,D,X}Rm,@Rn - FPSCR: Nothing */
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
TCGv addr_hi = tcg_temp_new();
- int fr = XREG(B7_4);
+ int fr = XHACK(B7_4);
tcg_gen_addi_i32(addr_hi, REG(B11_8), 4);
- tcg_gen_qemu_st_i32(cpu_fregs[fr], REG(B11_8),
- ctx->memidx, MO_TEUL);
- tcg_gen_qemu_st_i32(cpu_fregs[fr+1], addr_hi,
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(fr), REG(B11_8), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
tcg_temp_free(addr_hi);
} else {
- tcg_gen_qemu_st_i32(cpu_fregs[FREG(B7_4)], REG(B11_8),
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(B7_4), REG(B11_8), ctx->memidx, MO_TEUL);
}
return;
case 0xf008: /* fmov @Rm,{F,D,X}Rn - FPSCR: Nothing */
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
TCGv addr_hi = tcg_temp_new();
- int fr = XREG(B11_8);
+ int fr = XHACK(B11_8);
tcg_gen_addi_i32(addr_hi, REG(B7_4), 4);
- tcg_gen_qemu_ld_i32(cpu_fregs[fr], REG(B7_4), ctx->memidx, MO_TEUL);
- tcg_gen_qemu_ld_i32(cpu_fregs[fr+1], addr_hi, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(fr), REG(B7_4), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
tcg_temp_free(addr_hi);
} else {
- tcg_gen_qemu_ld_i32(cpu_fregs[FREG(B11_8)], REG(B7_4),
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
}
return;
case 0xf009: /* fmov @Rm+,{F,D,X}Rn - FPSCR: Nothing */
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
TCGv addr_hi = tcg_temp_new();
- int fr = XREG(B11_8);
+ int fr = XHACK(B11_8);
tcg_gen_addi_i32(addr_hi, REG(B7_4), 4);
- tcg_gen_qemu_ld_i32(cpu_fregs[fr], REG(B7_4), ctx->memidx, MO_TEUL);
- tcg_gen_qemu_ld_i32(cpu_fregs[fr+1], addr_hi, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(fr), REG(B7_4), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 8);
tcg_temp_free(addr_hi);
} else {
- tcg_gen_qemu_ld_i32(cpu_fregs[FREG(B11_8)], REG(B7_4),
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 4);
}
return;
@@ -1047,13 +1043,12 @@ static void _decode_opc(DisasContext * ctx)
TCGv addr = tcg_temp_new_i32();
tcg_gen_subi_i32(addr, REG(B11_8), 4);
if (ctx->tbflags & FPSCR_SZ) {
- int fr = XREG(B7_4);
- tcg_gen_qemu_st_i32(cpu_fregs[fr+1], addr, ctx->memidx, MO_TEUL);
+ int fr = XHACK(B7_4);
+ tcg_gen_qemu_st_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
tcg_gen_subi_i32(addr, addr, 4);
- tcg_gen_qemu_st_i32(cpu_fregs[fr], addr, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
} else {
- tcg_gen_qemu_st_i32(cpu_fregs[FREG(B7_4)], addr,
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
}
tcg_gen_mov_i32(REG(B11_8), addr);
tcg_temp_free(addr);
@@ -1064,15 +1059,12 @@ static void _decode_opc(DisasContext * ctx)
TCGv addr = tcg_temp_new_i32();
tcg_gen_add_i32(addr, REG(B7_4), REG(0));
if (ctx->tbflags & FPSCR_SZ) {
- int fr = XREG(B11_8);
- tcg_gen_qemu_ld_i32(cpu_fregs[fr], addr,
- ctx->memidx, MO_TEUL);
+ int fr = XHACK(B11_8);
+ tcg_gen_qemu_ld_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
tcg_gen_addi_i32(addr, addr, 4);
- tcg_gen_qemu_ld_i32(cpu_fregs[fr+1], addr,
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
} else {
- tcg_gen_qemu_ld_i32(cpu_fregs[FREG(B11_8)], addr,
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(B11_8), addr, ctx->memidx, MO_TEUL);
}
tcg_temp_free(addr);
}
@@ -1083,15 +1075,12 @@ static void _decode_opc(DisasContext * ctx)
TCGv addr = tcg_temp_new();
tcg_gen_add_i32(addr, REG(B11_8), REG(0));
if (ctx->tbflags & FPSCR_SZ) {
- int fr = XREG(B7_4);
- tcg_gen_qemu_ld_i32(cpu_fregs[fr], addr,
- ctx->memidx, MO_TEUL);
+ int fr = XHACK(B7_4);
+ tcg_gen_qemu_ld_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
tcg_gen_addi_i32(addr, addr, 4);
- tcg_gen_qemu_ld_i32(cpu_fregs[fr+1], addr,
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
} else {
- tcg_gen_qemu_st_i32(cpu_fregs[FREG(B7_4)], addr,
- ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
}
tcg_temp_free(addr);
}
@@ -1139,34 +1128,28 @@ static void _decode_opc(DisasContext * ctx)
} else {
switch (ctx->opcode & 0xf00f) {
case 0xf000: /* fadd Rm,Rn */
- gen_helper_fadd_FT(cpu_fregs[FREG(B11_8)], cpu_env,
- cpu_fregs[FREG(B11_8)],
- cpu_fregs[FREG(B7_4)]);
+ gen_helper_fadd_FT(FREG(B11_8), cpu_env,
+ FREG(B11_8), FREG(B7_4));
break;
case 0xf001: /* fsub Rm,Rn */
- gen_helper_fsub_FT(cpu_fregs[FREG(B11_8)], cpu_env,
- cpu_fregs[FREG(B11_8)],
- cpu_fregs[FREG(B7_4)]);
+ gen_helper_fsub_FT(FREG(B11_8), cpu_env,
+ FREG(B11_8), FREG(B7_4));
break;
case 0xf002: /* fmul Rm,Rn */
- gen_helper_fmul_FT(cpu_fregs[FREG(B11_8)], cpu_env,
- cpu_fregs[FREG(B11_8)],
- cpu_fregs[FREG(B7_4)]);
+ gen_helper_fmul_FT(FREG(B11_8), cpu_env,
+ FREG(B11_8), FREG(B7_4));
break;
case 0xf003: /* fdiv Rm,Rn */
- gen_helper_fdiv_FT(cpu_fregs[FREG(B11_8)], cpu_env,
- cpu_fregs[FREG(B11_8)],
- cpu_fregs[FREG(B7_4)]);
+ gen_helper_fdiv_FT(FREG(B11_8), cpu_env,
+ FREG(B11_8), FREG(B7_4));
break;
case 0xf004: /* fcmp/eq Rm,Rn */
gen_helper_fcmp_eq_FT(cpu_sr_t, cpu_env,
- cpu_fregs[FREG(B11_8)],
- cpu_fregs[FREG(B7_4)]);
+ FREG(B11_8), FREG(B7_4));
return;
case 0xf005: /* fcmp/gt Rm,Rn */
gen_helper_fcmp_gt_FT(cpu_sr_t, cpu_env,
- cpu_fregs[FREG(B11_8)],
- cpu_fregs[FREG(B7_4)]);
+ FREG(B11_8), FREG(B7_4));
return;
}
}
@@ -1178,9 +1161,8 @@ static void _decode_opc(DisasContext * ctx)
if (ctx->tbflags & FPSCR_PR) {
break; /* illegal instruction */
} else {
- gen_helper_fmac_FT(cpu_fregs[FREG(B11_8)], cpu_env,
- cpu_fregs[FREG(0)], cpu_fregs[FREG(B7_4)],
- cpu_fregs[FREG(B11_8)]);
+ gen_helper_fmac_FT(FREG(B11_8), cpu_env,
+ FREG(0), FREG(B7_4), FREG(B11_8));
return;
}
}
@@ -1718,11 +1700,11 @@ static void _decode_opc(DisasContext * ctx)
return;
case 0xf00d: /* fsts FPUL,FRn - FPSCR: Nothing */
CHECK_FPU_ENABLED
- tcg_gen_mov_i32(cpu_fregs[FREG(B11_8)], cpu_fpul);
+ tcg_gen_mov_i32(FREG(B11_8), cpu_fpul);
return;
case 0xf01d: /* flds FRm,FPUL - FPSCR: Nothing */
CHECK_FPU_ENABLED
- tcg_gen_mov_i32(cpu_fpul, cpu_fregs[FREG(B11_8)]);
+ tcg_gen_mov_i32(cpu_fpul, FREG(B11_8));
return;
case 0xf02d: /* float FPUL,FRn/DRn - FPSCR: R[PR,Enable.I]/W[Cause,Flag] */
CHECK_FPU_ENABLED
@@ -1736,7 +1718,7 @@ static void _decode_opc(DisasContext * ctx)
tcg_temp_free_i64(fp);
}
else {
- gen_helper_float_FT(cpu_fregs[FREG(B11_8)], cpu_env, cpu_fpul);
+ gen_helper_float_FT(FREG(B11_8), cpu_env, cpu_fpul);
}
return;
case 0xf03d: /* ftrc FRm/DRm,FPUL - FPSCR: R[PR,Enable.V]/W[Cause,Flag] */
@@ -1751,18 +1733,16 @@ static void _decode_opc(DisasContext * ctx)
tcg_temp_free_i64(fp);
}
else {
- gen_helper_ftrc_FT(cpu_fpul, cpu_env, cpu_fregs[FREG(B11_8)]);
+ gen_helper_ftrc_FT(cpu_fpul, cpu_env, FREG(B11_8));
}
return;
case 0xf04d: /* fneg FRn/DRn - FPSCR: Nothing */
CHECK_FPU_ENABLED
- tcg_gen_xori_i32(cpu_fregs[FREG(B11_8)], cpu_fregs[FREG(B11_8)],
- 0x80000000);
+ tcg_gen_xori_i32(FREG(B11_8), FREG(B11_8), 0x80000000);
return;
case 0xf05d: /* fabs FRn/DRn - FPCSR: Nothing */
CHECK_FPU_ENABLED
- tcg_gen_andi_i32(cpu_fregs[FREG(B11_8)], cpu_fregs[FREG(B11_8)],
- 0x7fffffff);
+ tcg_gen_andi_i32(FREG(B11_8), FREG(B11_8), 0x7fffffff);
return;
case 0xf06d: /* fsqrt FRn */
CHECK_FPU_ENABLED
@@ -1775,8 +1755,7 @@ static void _decode_opc(DisasContext * ctx)
gen_store_fpr64(fp, DREG(B11_8));
tcg_temp_free_i64(fp);
} else {
- gen_helper_fsqrt_FT(cpu_fregs[FREG(B11_8)], cpu_env,
- cpu_fregs[FREG(B11_8)]);
+ gen_helper_fsqrt_FT(FREG(B11_8), cpu_env, FREG(B11_8));
}
return;
case 0xf07d: /* fsrra FRn */
@@ -1785,13 +1764,13 @@ static void _decode_opc(DisasContext * ctx)
case 0xf08d: /* fldi0 FRn - FPSCR: R[PR] */
CHECK_FPU_ENABLED
if (!(ctx->tbflags & FPSCR_PR)) {
- tcg_gen_movi_i32(cpu_fregs[FREG(B11_8)], 0);
+ tcg_gen_movi_i32(FREG(B11_8), 0);
}
return;
case 0xf09d: /* fldi1 FRn - FPSCR: R[PR] */
CHECK_FPU_ENABLED
if (!(ctx->tbflags & FPSCR_PR)) {
- tcg_gen_movi_i32(cpu_fregs[FREG(B11_8)], 0x3f800000);
+ tcg_gen_movi_i32(FREG(B11_8), 0x3f800000);
}
return;
case 0xf0ad: /* fcnvsd FPUL,DRn */
--
2.9.4
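
For the 64-bit fmov cases in the patch above, the index handed to FREG
comes from XHACK. The sketch below restates that macro as a function so
the mapping can be checked; the function names are invented for
illustration. The reading that bit 0 of the register field selects the
other bank for these moves is my understanding of the encoding, not
something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Same expression as the XHACK macro in the diff. */
static int xhack(int x)
{
    return ((x & 1) << 4) | (x & 0xe);
}

/* Index that FREG(XHACK(x)) resolves to after this patch:
   the FR bit flips bit 4 of the XHACK result. */
static int xreg_index(int x, bool fr)
{
    return fr ? xhack(x) ^ 0x10 : xhack(x);
}
```

The result of xhack() is always even, so "fr" and "fr + 1" as used in
the diff stay inside one bank of 16 registers.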
* [Qemu-devel] [PATCH v2 12/27] target/sh4: Pass DisasContext to fpr64 routines
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (10 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 11/27] target/sh4: Unify cpu_fregs into FREG Richard Henderson
@ 2017-07-07 2:20 ` Richard Henderson
2017-07-07 21:55 ` Aurelien Jarno
2017-07-08 16:56 ` Philippe Mathieu-Daudé
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 13/27] target/sh4: Hoist fp register bank selection Richard Henderson
` (15 subsequent siblings)
27 siblings, 2 replies; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:20 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index b521cff..878c0bd 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -343,12 +343,12 @@ static void gen_delayed_conditional_jump(DisasContext * ctx)
gen_jump(ctx);
}
-static inline void gen_load_fpr64(TCGv_i64 t, int reg)
+static inline void gen_load_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
{
tcg_gen_concat_i32_i64(t, cpu_fregs[reg + 1], cpu_fregs[reg]);
}
-static inline void gen_store_fpr64 (TCGv_i64 t, int reg)
+static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
{
tcg_gen_extr_i64_i32(cpu_fregs[reg + 1], cpu_fregs[reg], t);
}
@@ -990,8 +990,8 @@ static void _decode_opc(DisasContext * ctx)
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
TCGv_i64 fp = tcg_temp_new_i64();
- gen_load_fpr64(fp, XHACK(B7_4));
- gen_store_fpr64(fp, XHACK(B11_8));
+ gen_load_fpr64(ctx, fp, XHACK(B7_4));
+ gen_store_fpr64(ctx, fp, XHACK(B11_8));
tcg_temp_free_i64(fp);
} else {
tcg_gen_mov_i32(FREG(B11_8), FREG(B7_4));
@@ -1100,8 +1100,8 @@ static void _decode_opc(DisasContext * ctx)
break; /* illegal instruction */
fp0 = tcg_temp_new_i64();
fp1 = tcg_temp_new_i64();
- gen_load_fpr64(fp0, DREG(B11_8));
- gen_load_fpr64(fp1, DREG(B7_4));
+ gen_load_fpr64(ctx, fp0, DREG(B11_8));
+ gen_load_fpr64(ctx, fp1, DREG(B7_4));
switch (ctx->opcode & 0xf00f) {
case 0xf000: /* fadd Rm,Rn */
gen_helper_fadd_DT(fp0, cpu_env, fp0, fp1);
@@ -1122,7 +1122,7 @@ static void _decode_opc(DisasContext * ctx)
gen_helper_fcmp_gt_DT(cpu_sr_t, cpu_env, fp0, fp1);
return;
}
- gen_store_fpr64(fp0, DREG(B11_8));
+ gen_store_fpr64(ctx, fp0, DREG(B11_8));
tcg_temp_free_i64(fp0);
tcg_temp_free_i64(fp1);
} else {
@@ -1714,7 +1714,7 @@ static void _decode_opc(DisasContext * ctx)
break; /* illegal instruction */
fp = tcg_temp_new_i64();
gen_helper_float_DT(fp, cpu_env, cpu_fpul);
- gen_store_fpr64(fp, DREG(B11_8));
+ gen_store_fpr64(ctx, fp, DREG(B11_8));
tcg_temp_free_i64(fp);
}
else {
@@ -1728,7 +1728,7 @@ static void _decode_opc(DisasContext * ctx)
if (ctx->opcode & 0x0100)
break; /* illegal instruction */
fp = tcg_temp_new_i64();
- gen_load_fpr64(fp, DREG(B11_8));
+ gen_load_fpr64(ctx, fp, DREG(B11_8));
gen_helper_ftrc_DT(cpu_fpul, cpu_env, fp);
tcg_temp_free_i64(fp);
}
@@ -1750,9 +1750,9 @@ static void _decode_opc(DisasContext * ctx)
if (ctx->opcode & 0x0100)
break; /* illegal instruction */
TCGv_i64 fp = tcg_temp_new_i64();
- gen_load_fpr64(fp, DREG(B11_8));
+ gen_load_fpr64(ctx, fp, DREG(B11_8));
gen_helper_fsqrt_DT(fp, cpu_env, fp);
- gen_store_fpr64(fp, DREG(B11_8));
+ gen_store_fpr64(ctx, fp, DREG(B11_8));
tcg_temp_free_i64(fp);
} else {
gen_helper_fsqrt_FT(FREG(B11_8), cpu_env, FREG(B11_8));
@@ -1778,7 +1778,7 @@ static void _decode_opc(DisasContext * ctx)
{
TCGv_i64 fp = tcg_temp_new_i64();
gen_helper_fcnvsd_FT_DT(fp, cpu_env, cpu_fpul);
- gen_store_fpr64(fp, DREG(B11_8));
+ gen_store_fpr64(ctx, fp, DREG(B11_8));
tcg_temp_free_i64(fp);
}
return;
@@ -1786,7 +1786,7 @@ static void _decode_opc(DisasContext * ctx)
CHECK_FPU_ENABLED
{
TCGv_i64 fp = tcg_temp_new_i64();
- gen_load_fpr64(fp, DREG(B11_8));
+ gen_load_fpr64(ctx, fp, DREG(B11_8));
gen_helper_fcnvds_DT_FT(cpu_fpul, cpu_env, fp);
tcg_temp_free_i64(fp);
}
--
2.9.4
* [Qemu-devel] [PATCH v2 13/27] target/sh4: Hoist fp register bank selection
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (11 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 12/27] target/sh4: Pass DisasContext to fpr64 routines Richard Henderson
@ 2017-07-07 2:20 ` Richard Henderson
2017-07-07 21:57 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 14/27] target/sh4: Eliminate unused XREG macro Richard Henderson
` (14 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:20 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Compute which register bank to use once at the start of translation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 878c0bd..fc743da 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -42,6 +42,7 @@ typedef struct DisasContext {
int bstate;
int memidx;
int gbank;
+ int fbank;
uint32_t delayed_pc;
int singlestep_enabled;
uint32_t features;
@@ -365,12 +366,12 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
#define REG(x) cpu_gregs[(x) ^ ctx->gbank]
#define ALTREG(x) cpu_gregs[(x) ^ ctx->gbank ^ 0x10]
+#define FREG(x) cpu_fregs[(x) ^ ctx->fbank]
-#define FREG(x) cpu_fregs[ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x)]
#define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
-#define XREG(x) FREG(XHACK(x))
+#define XREG(x) FREG(XHACK(x))
/* Assumes lsb of (x) is always 0 */
-#define DREG(x) (ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x))
+#define DREG(x) ((x) ^ ctx->fbank)
#define CHECK_NOT_DELAY_SLOT \
if (ctx->envflags & DELAY_SLOT_MASK) { \
@@ -2252,6 +2253,7 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
ctx.has_movcal = (ctx.tbflags & TB_FLAG_PENDING_MOVCA);
ctx.gbank = ((ctx.tbflags & (1 << SR_MD)) &&
(ctx.tbflags & (1 << SR_RB))) * 0x10;
+ ctx.fbank = ctx.tbflags & FPSCR_FR ? 0x10 : 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
--
2.9.4
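
The "Assumes lsb of (x) is always 0" comment on DREG in the patch above
matters because the fpr64 routines access cpu_fregs[reg] and
cpu_fregs[reg + 1]. A small sketch of why, assuming 32 fp registers in
two banks of 16; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_FREGS = 32 };

static int fbank_of(bool fr)
{
    return fr ? 0x10 : 0;
}

/* True if the pair (reg ^ fbank, (reg ^ fbank) + 1) is in range
   and both halves live in the same bank of 16 registers. */
static bool pair_stays_in_bank(int reg, int fbank)
{
    int lo = reg ^ fbank;
    int hi = lo + 1;

    return hi < NUM_FREGS && (lo & 0x10) == (hi & 0x10);
}
```

Every even register number passes in both banks. An odd number such as
15 either crosses into the other bank (fbank = 0) or runs past the end
of the array (fbank = 0x10).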
* [Qemu-devel] [PATCH v2 14/27] target/sh4: Eliminate unused XREG macro
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (12 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 13/27] target/sh4: Hoist fp register bank selection Richard Henderson
@ 2017-07-07 2:20 ` Richard Henderson
2017-07-07 21:59 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 15/27] target/sh4: Merge DREG into fpr64 routines Richard Henderson
` (13 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:20 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index fc743da..b6c3ff9 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -369,7 +369,6 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
#define FREG(x) cpu_fregs[(x) ^ ctx->fbank]
#define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
-#define XREG(x) FREG(XHACK(x))
/* Assumes lsb of (x) is always 0 */
#define DREG(x) ((x) ^ ctx->fbank)
--
2.9.4
* [Qemu-devel] [PATCH v2 15/27] target/sh4: Merge DREG into fpr64 routines
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (13 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 14/27] target/sh4: Eliminate unused XREG macro Richard Henderson
@ 2017-07-07 2:20 ` Richard Henderson
2017-07-07 22:06 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 16/27] target/sh4: Load/store Dr as 64-bit quantities Richard Henderson
` (12 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:20 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Also add a debugging assert that we have already signaled an illegal
opcode for odd double-precision registers.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index b6c3ff9..616e615 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -346,11 +346,17 @@ static void gen_delayed_conditional_jump(DisasContext * ctx)
static inline void gen_load_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
{
+ /* We have already signaled illegal instruction for odd Dr. */
+ tcg_debug_assert((reg & 1) == 0);
+ reg ^= ctx->fbank;
tcg_gen_concat_i32_i64(t, cpu_fregs[reg + 1], cpu_fregs[reg]);
}
static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
{
+ /* We have already signaled illegal instruction for odd Dr. */
+ tcg_debug_assert((reg & 1) == 0);
+ reg ^= ctx->fbank;
tcg_gen_extr_i64_i32(cpu_fregs[reg + 1], cpu_fregs[reg], t);
}
@@ -369,8 +375,6 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
#define FREG(x) cpu_fregs[(x) ^ ctx->fbank]
#define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
-/* Assumes lsb of (x) is always 0 */
-#define DREG(x) ((x) ^ ctx->fbank)
#define CHECK_NOT_DELAY_SLOT \
if (ctx->envflags & DELAY_SLOT_MASK) { \
@@ -1100,8 +1104,8 @@ static void _decode_opc(DisasContext * ctx)
break; /* illegal instruction */
fp0 = tcg_temp_new_i64();
fp1 = tcg_temp_new_i64();
- gen_load_fpr64(ctx, fp0, DREG(B11_8));
- gen_load_fpr64(ctx, fp1, DREG(B7_4));
+ gen_load_fpr64(ctx, fp0, B11_8);
+ gen_load_fpr64(ctx, fp1, B7_4);
switch (ctx->opcode & 0xf00f) {
case 0xf000: /* fadd Rm,Rn */
gen_helper_fadd_DT(fp0, cpu_env, fp0, fp1);
@@ -1122,7 +1126,7 @@ static void _decode_opc(DisasContext * ctx)
gen_helper_fcmp_gt_DT(cpu_sr_t, cpu_env, fp0, fp1);
return;
}
- gen_store_fpr64(ctx, fp0, DREG(B11_8));
+ gen_store_fpr64(ctx, fp0, B11_8);
tcg_temp_free_i64(fp0);
tcg_temp_free_i64(fp1);
} else {
@@ -1714,7 +1718,7 @@ static void _decode_opc(DisasContext * ctx)
break; /* illegal instruction */
fp = tcg_temp_new_i64();
gen_helper_float_DT(fp, cpu_env, cpu_fpul);
- gen_store_fpr64(ctx, fp, DREG(B11_8));
+ gen_store_fpr64(ctx, fp, B11_8);
tcg_temp_free_i64(fp);
}
else {
@@ -1728,7 +1732,7 @@ static void _decode_opc(DisasContext * ctx)
if (ctx->opcode & 0x0100)
break; /* illegal instruction */
fp = tcg_temp_new_i64();
- gen_load_fpr64(ctx, fp, DREG(B11_8));
+ gen_load_fpr64(ctx, fp, B11_8);
gen_helper_ftrc_DT(cpu_fpul, cpu_env, fp);
tcg_temp_free_i64(fp);
}
@@ -1750,9 +1754,9 @@ static void _decode_opc(DisasContext * ctx)
if (ctx->opcode & 0x0100)
break; /* illegal instruction */
TCGv_i64 fp = tcg_temp_new_i64();
- gen_load_fpr64(ctx, fp, DREG(B11_8));
+ gen_load_fpr64(ctx, fp, B11_8);
gen_helper_fsqrt_DT(fp, cpu_env, fp);
- gen_store_fpr64(ctx, fp, DREG(B11_8));
+ gen_store_fpr64(ctx, fp, B11_8);
tcg_temp_free_i64(fp);
} else {
gen_helper_fsqrt_FT(FREG(B11_8), cpu_env, FREG(B11_8));
@@ -1778,7 +1782,7 @@ static void _decode_opc(DisasContext * ctx)
{
TCGv_i64 fp = tcg_temp_new_i64();
gen_helper_fcnvsd_FT_DT(fp, cpu_env, cpu_fpul);
- gen_store_fpr64(ctx, fp, DREG(B11_8));
+ gen_store_fpr64(ctx, fp, B11_8);
tcg_temp_free_i64(fp);
}
return;
@@ -1786,7 +1790,7 @@ static void _decode_opc(DisasContext * ctx)
CHECK_FPU_ENABLED
{
TCGv_i64 fp = tcg_temp_new_i64();
- gen_load_fpr64(ctx, fp, DREG(B11_8));
+ gen_load_fpr64(ctx, fp, B11_8);
gen_helper_fcnvds_DT_FT(cpu_fpul, cpu_env, fp);
tcg_temp_free_i64(fp);
}
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
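For readers following patch 15, here is a minimal sketch in plain C of what moving the bank selection into the fpr64 routines amounts to. It is not QEMU code: FpuModel, load_fpr64 and store_fpr64 are hypothetical names, and the register file is modelled as a flat array of 32 words with fbank assumed to be 0 or 16. The word order (FR[n] high, FR[n+1] low) follows the operand order of the concat/extr calls in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two banks of 16 single-precision
   registers; this is not QEMU's cpu_fregs. */
typedef struct {
    uint32_t fregs[32];
    int fbank;              /* assumed to be 0 or 16 */
} FpuModel;

/* Read DRn as one 64-bit value: FR[n] is the high word, FR[n+1] the low. */
static uint64_t load_fpr64(const FpuModel *m, int reg)
{
    assert((reg & 1) == 0); /* odd DRn must have been rejected earlier */
    reg ^= m->fbank;        /* bank selection happens here, once */
    return ((uint64_t)m->fregs[reg] << 32) | m->fregs[reg + 1];
}

/* Write a 64-bit value back into the same register pair. */
static void store_fpr64(FpuModel *m, uint64_t val, int reg)
{
    assert((reg & 1) == 0);
    reg ^= m->fbank;
    m->fregs[reg] = (uint32_t)(val >> 32);
    m->fregs[reg + 1] = (uint32_t)val;
}
```

Callers pass the raw register field and no longer need to remember a DREG() wrapper, which is the point of the patch.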
* [Qemu-devel] [PATCH v2 16/27] target/sh4: Load/store Dr as 64-bit quantities
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (14 preceding siblings ...)
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 15/27] target/sh4: Merge DREG into fpr64 routines Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:14 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 17/27] target/sh4: Simplify 64-bit fp reg-reg move Richard Henderson
` (11 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
This enforces proper alignment and makes the register update
more natural. Note that this also includes a more serious bug fix:
fmov {D,X}Rm,@(R0,Rn) now uses a store instead of a load.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 74 ++++++++++++++++++++++++--------------------------
1 file changed, 35 insertions(+), 39 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 616e615..fcdabe8 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -1004,12 +1004,10 @@ static void _decode_opc(DisasContext * ctx)
case 0xf00a: /* fmov {F,D,X}Rm,@Rn - FPSCR: Nothing */
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
- TCGv addr_hi = tcg_temp_new();
- int fr = XHACK(B7_4);
- tcg_gen_addi_i32(addr_hi, REG(B11_8), 4);
- tcg_gen_qemu_st_i32(FREG(fr), REG(B11_8), ctx->memidx, MO_TEUL);
- tcg_gen_qemu_st_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
- tcg_temp_free(addr_hi);
+ TCGv_i64 fp = tcg_temp_new_i64();
+ gen_load_fpr64(ctx, fp, XHACK(B7_4));
+ tcg_gen_qemu_st_i64(fp, REG(B11_8), ctx->memidx, MO_TEQ);
+ tcg_temp_free_i64(fp);
} else {
tcg_gen_qemu_st_i32(FREG(B7_4), REG(B11_8), ctx->memidx, MO_TEUL);
}
@@ -1017,12 +1015,10 @@ static void _decode_opc(DisasContext * ctx)
case 0xf008: /* fmov @Rm,{F,D,X}Rn - FPSCR: Nothing */
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
- TCGv addr_hi = tcg_temp_new();
- int fr = XHACK(B11_8);
- tcg_gen_addi_i32(addr_hi, REG(B7_4), 4);
- tcg_gen_qemu_ld_i32(FREG(fr), REG(B7_4), ctx->memidx, MO_TEUL);
- tcg_gen_qemu_ld_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
- tcg_temp_free(addr_hi);
+ TCGv_i64 fp = tcg_temp_new_i64();
+ tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx, MO_TEQ);
+ gen_store_fpr64(ctx, fp, XHACK(B11_8));
+ tcg_temp_free_i64(fp);
} else {
tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
}
@@ -1030,13 +1026,11 @@ static void _decode_opc(DisasContext * ctx)
case 0xf009: /* fmov @Rm+,{F,D,X}Rn - FPSCR: Nothing */
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
- TCGv addr_hi = tcg_temp_new();
- int fr = XHACK(B11_8);
- tcg_gen_addi_i32(addr_hi, REG(B7_4), 4);
- tcg_gen_qemu_ld_i32(FREG(fr), REG(B7_4), ctx->memidx, MO_TEUL);
- tcg_gen_qemu_ld_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
- tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 8);
- tcg_temp_free(addr_hi);
+ TCGv_i64 fp = tcg_temp_new_i64();
+ tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx, MO_TEQ);
+ gen_store_fpr64(ctx, fp, XHACK(B11_8));
+ tcg_temp_free_i64(fp);
+ tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 8);
} else {
tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 4);
@@ -1044,18 +1038,21 @@ static void _decode_opc(DisasContext * ctx)
return;
case 0xf00b: /* fmov {F,D,X}Rm,@-Rn - FPSCR: Nothing */
CHECK_FPU_ENABLED
- TCGv addr = tcg_temp_new_i32();
- tcg_gen_subi_i32(addr, REG(B11_8), 4);
- if (ctx->tbflags & FPSCR_SZ) {
- int fr = XHACK(B7_4);
- tcg_gen_qemu_st_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
- tcg_gen_subi_i32(addr, addr, 4);
- tcg_gen_qemu_st_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
- } else {
- tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
- }
- tcg_gen_mov_i32(REG(B11_8), addr);
- tcg_temp_free(addr);
+ {
+ TCGv addr = tcg_temp_new_i32();
+ if (ctx->tbflags & FPSCR_SZ) {
+ TCGv_i64 fp = tcg_temp_new_i64();
+ gen_load_fpr64(ctx, fp, XHACK(B7_4));
+ tcg_gen_subi_i32(addr, REG(B11_8), 8);
+ tcg_gen_qemu_st_i64(fp, addr, ctx->memidx, MO_TEQ);
+ tcg_temp_free_i64(fp);
+ } else {
+ tcg_gen_subi_i32(addr, REG(B11_8), 4);
+ tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
+ }
+ tcg_gen_mov_i32(REG(B11_8), addr);
+ tcg_temp_free(addr);
+ }
return;
case 0xf006: /* fmov @(R0,Rm),{F,D,X}Rm - FPSCR: Nothing */
CHECK_FPU_ENABLED
@@ -1063,10 +1059,10 @@ static void _decode_opc(DisasContext * ctx)
TCGv addr = tcg_temp_new_i32();
tcg_gen_add_i32(addr, REG(B7_4), REG(0));
if (ctx->tbflags & FPSCR_SZ) {
- int fr = XHACK(B11_8);
- tcg_gen_qemu_ld_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
- tcg_gen_addi_i32(addr, addr, 4);
- tcg_gen_qemu_ld_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
+ TCGv_i64 fp = tcg_temp_new_i64();
+ tcg_gen_qemu_ld_i64(fp, addr, ctx->memidx, MO_TEQ);
+ gen_store_fpr64(ctx, fp, XHACK(B11_8));
+ tcg_temp_free_i64(fp);
} else {
tcg_gen_qemu_ld_i32(FREG(B11_8), addr, ctx->memidx, MO_TEUL);
}
@@ -1079,10 +1075,10 @@ static void _decode_opc(DisasContext * ctx)
TCGv addr = tcg_temp_new();
tcg_gen_add_i32(addr, REG(B11_8), REG(0));
if (ctx->tbflags & FPSCR_SZ) {
- int fr = XHACK(B7_4);
- tcg_gen_qemu_ld_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
- tcg_gen_addi_i32(addr, addr, 4);
- tcg_gen_qemu_ld_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
+ TCGv_i64 fp = tcg_temp_new_i64();
+ gen_load_fpr64(ctx, fp, XHACK(B7_4));
+ tcg_gen_qemu_st_i64(fp, addr, ctx->memidx, MO_TEQ);
+ tcg_temp_free_i64(fp);
} else {
tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
}
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
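The 64-bit fmov forms in patch 16 pass XHACK(...) to the fpr64 routines. A small sketch of what that index computation does, assuming a flat 32-entry register file and fbank of 0 or 16; pair_index is a hypothetical helper, only the XHACK expression itself is taken from translate.c. XHACK moves the low bit of the 4-bit field up to bit 4, so an odd encoding (XDn) becomes an even index in the other bank, and the result is always even, which is what the debug assert added in patch 15 relies on.

```c
#include <assert.h>

/* Same expression as the XHACK macro in translate.c. */
#define XHACK(x) ((((x) & 1) << 4) | ((x) & 0xe))

/* Hypothetical helper: index of the first register of the pair named
   by a 4-bit opcode field, for a flat 32-entry register file. */
static int pair_index(int field, int fbank)
{
    assert(field >= 0 && field < 16);
    assert(fbank == 0 || fbank == 16);
    return XHACK(field) ^ fbank;
}
```

For example, field 3 with fbank 0 gives index 18, i.e. the pair starting at register 2 of the other bank.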
* [Qemu-devel] [PATCH v2 17/27] target/sh4: Simplify 64-bit fp reg-reg move
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (15 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 16/27] target/sh4: Load/store Dr as 64-bit quantities Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:15 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 18/27] target/sh4: Unify code for CHECK_NOT_DELAY_SLOT Richard Henderson
` (10 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
We do not need to form full 64-bit quantities in order to perform
the move. This reduces code expansion on 64-bit hosts.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index fcdabe8..3453f19 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -993,10 +993,10 @@ static void _decode_opc(DisasContext * ctx)
case 0xf00c: /* fmov {F,D,X}Rm,{F,D,X}Rn - FPSCR: Nothing */
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
- TCGv_i64 fp = tcg_temp_new_i64();
- gen_load_fpr64(ctx, fp, XHACK(B7_4));
- gen_store_fpr64(ctx, fp, XHACK(B11_8));
- tcg_temp_free_i64(fp);
+ int xsrc = XHACK(B7_4);
+ int xdst = XHACK(B11_8);
+ tcg_gen_mov_i32(FREG(xdst), FREG(xsrc));
+ tcg_gen_mov_i32(FREG(xdst + 1), FREG(xsrc + 1));
} else {
tcg_gen_mov_i32(FREG(B11_8), FREG(B7_4));
}
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
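To illustrate why patch 17 is safe, here is a sketch in plain C (hypothetical function names, a plain array instead of TCG temporaries) showing that copying a register pair through a 64-bit temporary and copying it as two 32-bit words leave the register file in the same state.

```c
#include <assert.h>
#include <stdint.h>

/* Old shape: build a 64-bit temporary, then split it again. */
static void move_pair_via_i64(uint32_t *fregs, int dst, int src)
{
    assert(((dst | src) & 1) == 0);   /* pairs start at even indices */
    uint64_t tmp = ((uint64_t)fregs[src] << 32) | fregs[src + 1];
    fregs[dst] = (uint32_t)(tmp >> 32);
    fregs[dst + 1] = (uint32_t)tmp;
}

/* New shape: two independent 32-bit copies, no 64-bit value formed. */
static void move_pair_direct(uint32_t *fregs, int dst, int src)
{
    assert(((dst | src) & 1) == 0);
    fregs[dst] = fregs[src];
    fregs[dst + 1] = fregs[src + 1];
}
```

The direct version never needs the concatenate and extract steps, which is where the saving in generated code comes from.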
* [Qemu-devel] [PATCH v2 18/27] target/sh4: Unify code for CHECK_NOT_DELAY_SLOT
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (16 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 17/27] target/sh4: Simplify 64-bit fp reg-reg move Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:17 ` Aurelien Jarno
2017-07-08 16:59 ` Philippe Mathieu-Daudé
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 19/27] target/sh4: Unify code for CHECK_PRIVILEGED Richard Henderson
` (9 subsequent siblings)
27 siblings, 2 replies; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
We do not need to emit N copies of raising an exception.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 3453f19..41157a0 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -377,11 +377,8 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
#define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
#define CHECK_NOT_DELAY_SLOT \
- if (ctx->envflags & DELAY_SLOT_MASK) { \
- gen_save_cpu_state(ctx, true); \
- gen_helper_raise_slot_illegal_instruction(cpu_env); \
- ctx->bstate = BS_EXCP; \
- return; \
+ if (ctx->envflags & DELAY_SLOT_MASK) { \
+ goto do_illegal_slot; \
}
#define CHECK_PRIVILEGED \
@@ -1820,10 +1817,12 @@ static void _decode_opc(DisasContext * ctx)
ctx->opcode, ctx->pc);
fflush(stderr);
#endif
- gen_save_cpu_state(ctx, true);
if (ctx->envflags & DELAY_SLOT_MASK) {
+ do_illegal_slot:
+ gen_save_cpu_state(ctx, true);
gen_helper_raise_slot_illegal_instruction(cpu_env);
} else {
+ gen_save_cpu_state(ctx, true);
gen_helper_raise_illegal_instruction(cpu_env);
}
ctx->bstate = BS_EXCP;
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [Qemu-devel] [PATCH v2 19/27] target/sh4: Unify code for CHECK_PRIVILEGED
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (17 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 18/27] target/sh4: Unify code for CHECK_NOT_DELAY_SLOT Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:17 ` Aurelien Jarno
2017-07-08 17:00 ` Philippe Mathieu-Daudé
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 20/27] target/sh4: Unify code for CHECK_FPU_ENABLED Richard Henderson
` (8 subsequent siblings)
27 siblings, 2 replies; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
We do not need to emit N copies of raising an exception.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 41157a0..dd14b43 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -381,16 +381,9 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
goto do_illegal_slot; \
}
-#define CHECK_PRIVILEGED \
- if (IS_USER(ctx)) { \
- gen_save_cpu_state(ctx, true); \
- if (ctx->envflags & DELAY_SLOT_MASK) { \
- gen_helper_raise_slot_illegal_instruction(cpu_env); \
- } else { \
- gen_helper_raise_illegal_instruction(cpu_env); \
- } \
- ctx->bstate = BS_EXCP; \
- return; \
+#define CHECK_PRIVILEGED \
+ if (IS_USER(ctx)) { \
+ goto do_illegal; \
}
#define CHECK_FPU_ENABLED \
@@ -1817,6 +1810,7 @@ static void _decode_opc(DisasContext * ctx)
ctx->opcode, ctx->pc);
fflush(stderr);
#endif
+ do_illegal:
if (ctx->envflags & DELAY_SLOT_MASK) {
do_illegal_slot:
gen_save_cpu_state(ctx, true);
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [Qemu-devel] [PATCH v2 20/27] target/sh4: Unify code for CHECK_FPU_ENABLED
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (18 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 19/27] target/sh4: Unify code for CHECK_PRIVILEGED Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:18 ` Aurelien Jarno
2017-07-08 17:01 ` Philippe Mathieu-Daudé
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 21/27] target/sh4: Tidy misc illegal insn checks Richard Henderson
` (7 subsequent siblings)
27 siblings, 2 replies; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
We do not need to emit N copies of raising an exception.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index dd14b43..a4370c6 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -386,16 +386,9 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
goto do_illegal; \
}
-#define CHECK_FPU_ENABLED \
- if (ctx->tbflags & (1u << SR_FD)) { \
- gen_save_cpu_state(ctx, true); \
- if (ctx->envflags & DELAY_SLOT_MASK) { \
- gen_helper_raise_slot_fpu_disable(cpu_env); \
- } else { \
- gen_helper_raise_fpu_disable(cpu_env); \
- } \
- ctx->bstate = BS_EXCP; \
- return; \
+#define CHECK_FPU_ENABLED \
+ if (ctx->tbflags & (1u << SR_FD)) { \
+ goto do_fpu_disabled; \
}
static void _decode_opc(DisasContext * ctx)
@@ -1820,6 +1813,17 @@ static void _decode_opc(DisasContext * ctx)
gen_helper_raise_illegal_instruction(cpu_env);
}
ctx->bstate = BS_EXCP;
+ return;
+
+ do_fpu_disabled:
+ gen_save_cpu_state(ctx, true);
+ if (ctx->envflags & DELAY_SLOT_MASK) {
+ gen_helper_raise_slot_fpu_disable(cpu_env);
+ } else {
+ gen_helper_raise_fpu_disable(cpu_env);
+ }
+ ctx->bstate = BS_EXCP;
+ return;
}
static void decode_opc(DisasContext * ctx)
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
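Patches 18 to 20 all apply the same idea: the CHECK_* macros stop expanding to a full copy of the exception code and instead jump to one shared tail at the end of the decoder. A reduced sketch of that control flow, assuming a toy decoder; Ctx, decode, the OP_* and EXC_* values are all hypothetical, and the return codes stand in for the gen_helper_raise_* calls.

```c
#include <assert.h>
#include <stdbool.h>

enum { EXC_NONE, EXC_ILLEGAL, EXC_SLOT_ILLEGAL,
       EXC_FPU_DISABLE, EXC_SLOT_FPU_DISABLE };
enum { OP_NOP, OP_BRANCH, OP_PRIV, OP_FPU };

typedef struct {
    bool in_delay_slot;
    bool user_mode;
    bool fpu_disabled;
} Ctx;

/* Each check is now a single conditional jump to a shared label. */
#define CHECK_NOT_DELAY_SLOT if (ctx->in_delay_slot) { goto do_illegal_slot; }
#define CHECK_PRIVILEGED     if (ctx->user_mode) { goto do_illegal; }
#define CHECK_FPU_ENABLED    if (ctx->fpu_disabled) { goto do_fpu_disabled; }

static int decode(const Ctx *ctx, int op)
{
    assert(ctx != 0);
    switch (op) {
    case OP_NOP:
        return EXC_NONE;
    case OP_BRANCH:
        CHECK_NOT_DELAY_SLOT
        return EXC_NONE;
    case OP_PRIV:
        CHECK_PRIVILEGED
        return EXC_NONE;
    case OP_FPU:
        CHECK_FPU_ENABLED
        return EXC_NONE;
    }

    /* Unknown opcodes fall out of the switch into the shared tail. */
 do_illegal:
    if (ctx->in_delay_slot) {
 do_illegal_slot:
        return EXC_SLOT_ILLEGAL;
    }
    return EXC_ILLEGAL;

 do_fpu_disabled:
    return ctx->in_delay_slot ? EXC_SLOT_FPU_DISABLE : EXC_FPU_DISABLE;
}
```

The exception-raising code exists once per kind of fault rather than once per use of a macro, at the cost of the macros only being usable inside the function that defines the labels.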
* [Qemu-devel] [PATCH v2 21/27] target/sh4: Tidy misc illegal insn checks
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (19 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 20/27] target/sh4: Unify code for CHECK_FPU_ENABLED Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:18 ` Aurelien Jarno
2017-07-08 17:02 ` Philippe Mathieu-Daudé
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 22/27] target/sh4: Introduce CHECK_FPSCR_PR_* Richard Henderson
` (6 subsequent siblings)
27 siblings, 2 replies; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Now that we have a do_illegal label, use goto in order
to self-document the forcing of the exception.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index a4370c6..06cf649 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -1079,8 +1079,9 @@ static void _decode_opc(DisasContext * ctx)
if (ctx->tbflags & FPSCR_PR) {
TCGv_i64 fp0, fp1;
- if (ctx->opcode & 0x0110)
- break; /* illegal instruction */
+ if (ctx->opcode & 0x0110) {
+ goto do_illegal;
+ }
fp0 = tcg_temp_new_i64();
fp1 = tcg_temp_new_i64();
gen_load_fpr64(ctx, fp0, B11_8);
@@ -1142,7 +1143,7 @@ static void _decode_opc(DisasContext * ctx)
{
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_PR) {
- break; /* illegal instruction */
+ goto do_illegal;
} else {
gen_helper_fmac_FT(FREG(B11_8), cpu_env,
FREG(0), FREG(B7_4), FREG(B11_8));
@@ -1693,8 +1694,9 @@ static void _decode_opc(DisasContext * ctx)
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_PR) {
TCGv_i64 fp;
- if (ctx->opcode & 0x0100)
- break; /* illegal instruction */
+ if (ctx->opcode & 0x0100) {
+ goto do_illegal;
+ }
fp = tcg_temp_new_i64();
gen_helper_float_DT(fp, cpu_env, cpu_fpul);
gen_store_fpr64(ctx, fp, B11_8);
@@ -1708,8 +1710,9 @@ static void _decode_opc(DisasContext * ctx)
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_PR) {
TCGv_i64 fp;
- if (ctx->opcode & 0x0100)
- break; /* illegal instruction */
+ if (ctx->opcode & 0x0100) {
+ goto do_illegal;
+ }
fp = tcg_temp_new_i64();
gen_load_fpr64(ctx, fp, B11_8);
gen_helper_ftrc_DT(cpu_fpul, cpu_env, fp);
@@ -1730,8 +1733,9 @@ static void _decode_opc(DisasContext * ctx)
case 0xf06d: /* fsqrt FRn */
CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_PR) {
- if (ctx->opcode & 0x0100)
- break; /* illegal instruction */
+ if (ctx->opcode & 0x0100) {
+ goto do_illegal;
+ }
TCGv_i64 fp = tcg_temp_new_i64();
gen_load_fpr64(ctx, fp, B11_8);
gen_helper_fsqrt_DT(fp, cpu_env, fp);
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [Qemu-devel] [PATCH v2 22/27] target/sh4: Introduce CHECK_FPSCR_PR_*
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (20 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 21/27] target/sh4: Tidy misc illegal insn checks Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:20 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 23/27] target/sh4: Introduce CHECK_SH4A Richard Henderson
` (5 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 57 +++++++++++++++++++++++++++-----------------------
1 file changed, 31 insertions(+), 26 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 06cf649..3d8ac59 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -391,6 +391,16 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
goto do_fpu_disabled; \
}
+#define CHECK_FPSCR_PR_0 \
+ if (ctx->tbflags & FPSCR_PR) { \
+ goto do_illegal; \
+ }
+
+#define CHECK_FPSCR_PR_1 \
+ if (!(ctx->tbflags & FPSCR_PR)) { \
+ goto do_illegal; \
+ }
+
static void _decode_opc(DisasContext * ctx)
{
/* This code tries to make movcal emulation sufficiently
@@ -1140,16 +1150,11 @@ static void _decode_opc(DisasContext * ctx)
}
return;
case 0xf00e: /* fmac FR0,RM,Rn */
- {
- CHECK_FPU_ENABLED
- if (ctx->tbflags & FPSCR_PR) {
- goto do_illegal;
- } else {
- gen_helper_fmac_FT(FREG(B11_8), cpu_env,
- FREG(0), FREG(B7_4), FREG(B11_8));
- return;
- }
- }
+ CHECK_FPU_ENABLED
+ CHECK_FPSCR_PR_0
+ gen_helper_fmac_FT(FREG(B11_8), cpu_env,
+ FREG(0), FREG(B7_4), FREG(B11_8));
+ return;
}
switch (ctx->opcode & 0xff00) {
@@ -1750,16 +1755,14 @@ static void _decode_opc(DisasContext * ctx)
break;
case 0xf08d: /* fldi0 FRn - FPSCR: R[PR] */
CHECK_FPU_ENABLED
- if (!(ctx->tbflags & FPSCR_PR)) {
- tcg_gen_movi_i32(FREG(B11_8), 0);
- }
- return;
+ CHECK_FPSCR_PR_0
+ tcg_gen_movi_i32(FREG(B11_8), 0);
+ return;
case 0xf09d: /* fldi1 FRn - FPSCR: R[PR] */
CHECK_FPU_ENABLED
- if (!(ctx->tbflags & FPSCR_PR)) {
- tcg_gen_movi_i32(FREG(B11_8), 0x3f800000);
- }
- return;
+ CHECK_FPSCR_PR_0
+ tcg_gen_movi_i32(FREG(B11_8), 0x3f800000);
+ return;
case 0xf0ad: /* fcnvsd FPUL,DRn */
CHECK_FPU_ENABLED
{
@@ -1780,10 +1783,10 @@ static void _decode_opc(DisasContext * ctx)
return;
case 0xf0ed: /* fipr FVm,FVn */
CHECK_FPU_ENABLED
- if ((ctx->tbflags & FPSCR_PR) == 0) {
- TCGv m, n;
- m = tcg_const_i32((ctx->opcode >> 8) & 3);
- n = tcg_const_i32((ctx->opcode >> 10) & 3);
+ CHECK_FPSCR_PR_1
+ {
+ TCGv m = tcg_const_i32((ctx->opcode >> 8) & 3);
+ TCGv n = tcg_const_i32((ctx->opcode >> 10) & 3);
gen_helper_fipr(cpu_env, m, n);
tcg_temp_free(m);
tcg_temp_free(n);
@@ -1792,10 +1795,12 @@ static void _decode_opc(DisasContext * ctx)
break;
case 0xf0fd: /* ftrv XMTRX,FVn */
CHECK_FPU_ENABLED
- if ((ctx->opcode & 0x0300) == 0x0100 &&
- (ctx->tbflags & FPSCR_PR) == 0) {
- TCGv n;
- n = tcg_const_i32((ctx->opcode >> 10) & 3);
+ CHECK_FPSCR_PR_1
+ {
+ if ((ctx->opcode & 0x0300) != 0x0100) {
+ goto do_illegal;
+ }
+ TCGv n = tcg_const_i32((ctx->opcode >> 10) & 3);
gen_helper_ftrv(cpu_env, n);
tcg_temp_free(n);
return;
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [Qemu-devel] [PATCH v2 23/27] target/sh4: Introduce CHECK_SH4A
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (21 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 22/27] target/sh4: Introduce CHECK_FPSCR_PR_* Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:21 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 24/27] target/sh4: Implement fpchg Richard Henderson
` (4 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 64 +++++++++++++++++++++++---------------------------
1 file changed, 29 insertions(+), 35 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 3d8ac59..d164e62 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -401,6 +401,11 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
goto do_illegal; \
}
+#define CHECK_SH4A \
+ if (!(ctx->features & SH_FEATURE_SH4A)) { \
+ goto do_illegal; \
+ }
+
static void _decode_opc(DisasContext * ctx)
{
/* This code tries to make movcal emulation sufficiently
@@ -1478,7 +1483,7 @@ static void _decode_opc(DisasContext * ctx)
LDST(ssr, 0x403e, 0x4037, 0x0032, 0x4033, CHECK_PRIVILEGED)
LDST(spc, 0x404e, 0x4047, 0x0042, 0x4043, CHECK_PRIVILEGED)
ST(sgr, 0x003a, 0x4032, CHECK_PRIVILEGED)
- LD(sgr, 0x403a, 0x4036, CHECK_PRIVILEGED if (!(ctx->features & SH_FEATURE_SH4A)) break;)
+ LD(sgr, 0x403a, 0x4036, CHECK_PRIVILEGED CHECK_SH4A)
LDST(dbr, 0x40fa, 0x40f6, 0x00fa, 0x40f2, CHECK_PRIVILEGED)
LDST(mach, 0x400a, 0x4006, 0x000a, 0x4002, {})
LDST(macl, 0x401a, 0x4016, 0x001a, 0x4012, {})
@@ -1528,21 +1533,19 @@ static void _decode_opc(DisasContext * ctx)
ctx->has_movcal = 1;
return;
case 0x40a9: /* movua.l @Rm,R0 */
+ CHECK_SH4A
/* Load non-boundary-aligned data */
- if (ctx->features & SH_FEATURE_SH4A) {
- tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx,
- MO_TEUL | MO_UNALN);
- return;
- }
+ tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx,
+ MO_TEUL | MO_UNALN);
+ return;
break;
case 0x40e9: /* movua.l @Rm+,R0 */
+ CHECK_SH4A
/* Load non-boundary-aligned data */
- if (ctx->features & SH_FEATURE_SH4A) {
- tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx,
- MO_TEUL | MO_UNALN);
- tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
- return;
- }
+ tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx,
+ MO_TEUL | MO_UNALN);
+ tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
+ return;
break;
case 0x0029: /* movt Rn */
tcg_gen_mov_i32(REG(B11_8), cpu_sr_t);
@@ -1553,7 +1556,8 @@ static void _decode_opc(DisasContext * ctx)
If (T == 1) R0 -> (Rn)
0 -> LDST
*/
- if (ctx->features & SH_FEATURE_SH4A) {
+ CHECK_SH4A
+ {
TCGLabel *fail = gen_new_label();
TCGLabel *done = gen_new_label();
TCGv tmp;
@@ -1572,8 +1576,6 @@ static void _decode_opc(DisasContext * ctx)
gen_set_label(done);
return;
- } else {
- break;
}
case 0x0063:
/* MOVLI.L @Rm,R0
@@ -1582,14 +1584,11 @@ static void _decode_opc(DisasContext * ctx)
When interrupt/exception
occurred 0 -> LDST
*/
- if (ctx->features & SH_FEATURE_SH4A) {
- tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx, MO_TESL);
- tcg_gen_mov_i32(cpu_lock_addr, REG(B11_8));
- tcg_gen_mov_i32(cpu_lock_value, REG(0));
- return;
- } else {
- break;
- }
+ CHECK_SH4A
+ tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx, MO_TESL);
+ tcg_gen_mov_i32(cpu_lock_addr, REG(B11_8));
+ tcg_gen_mov_i32(cpu_lock_value, REG(0));
+ return;
case 0x0093: /* ocbi @Rn */
{
gen_helper_ocbi(cpu_env, REG(B11_8));
@@ -1604,20 +1603,15 @@ static void _decode_opc(DisasContext * ctx)
case 0x0083: /* pref @Rn */
return;
case 0x00d3: /* prefi @Rn */
- if (ctx->features & SH_FEATURE_SH4A)
- return;
- else
- break;
+ CHECK_SH4A
+ return;
case 0x00e3: /* icbi @Rn */
- if (ctx->features & SH_FEATURE_SH4A)
- return;
- else
- break;
+ CHECK_SH4A
+ return;
case 0x00ab: /* synco */
- if (ctx->features & SH_FEATURE_SH4A) {
- tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
- return;
- }
+ CHECK_SH4A
+ tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
+ return;
break;
case 0x4024: /* rotcl Rn */
{
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [Qemu-devel] [PATCH v2 24/27] target/sh4: Implement fpchg
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (22 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 23/27] target/sh4: Introduce CHECK_SH4A Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:23 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 25/27] target/sh4: Add missing FPSCR.PR == 0 checks Richard Henderson
` (3 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index d164e62..35a5c91 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -492,6 +492,11 @@ static void _decode_opc(DisasContext * ctx)
tcg_gen_xori_i32(cpu_fpscr, cpu_fpscr, FPSCR_SZ);
ctx->bstate = BS_STOP;
return;
+ case 0xf7fd: /* fpchg */
+ CHECK_SH4A
+ tcg_gen_xori_i32(cpu_fpscr, cpu_fpscr, FPSCR_PR);
+ ctx->bstate = BS_STOP;
+ return;
case 0x0009: /* nop */
return;
case 0x001b: /* sleep */
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [Qemu-devel] [PATCH v2 25/27] target/sh4: Add missing FPSCR.PR == 0 checks
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (23 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 24/27] target/sh4: Implement fpchg Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:24 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 26/27] target/sh4: Implement fsrra Richard Henderson
` (2 subsequent siblings)
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Both frchg and fschg require PR == 0, otherwise undefined_operation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 35a5c91..2b62e39 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -485,10 +485,12 @@ static void _decode_opc(DisasContext * ctx)
tcg_gen_movi_i32(cpu_sr_t, 1);
return;
case 0xfbfd: /* frchg */
+ CHECK_FPSCR_PR_0
tcg_gen_xori_i32(cpu_fpscr, cpu_fpscr, FPSCR_FR);
ctx->bstate = BS_STOP;
return;
case 0xf3fd: /* fschg */
+ CHECK_FPSCR_PR_0
tcg_gen_xori_i32(cpu_fpscr, cpu_fpscr, FPSCR_SZ);
ctx->bstate = BS_STOP;
return;
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
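As a rough model of patches 24 and 25 together: frchg, fschg and fpchg each flip one FPSCR bit with an XOR, and after patch 25 the first two are refused when PR is set. The sketch below is plain C, not translator code; toggle_fpscr is a hypothetical helper, and the bit positions (PR = 19, SZ = 20, FR = 21) are assumed from the SH-4 FPSCR layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions assumed from the SH-4 FPSCR layout. */
#define FPSCR_PR (1u << 19)
#define FPSCR_SZ (1u << 20)
#define FPSCR_FR (1u << 21)

/* Flip `bit` in *fpscr. When need_pr0 is set and PR == 1, leave the
   register alone and return false, which stands in for the translator
   raising an illegal instruction. */
static bool toggle_fpscr(uint32_t *fpscr, uint32_t bit, bool need_pr0)
{
    assert(bit == FPSCR_PR || bit == FPSCR_SZ || bit == FPSCR_FR);
    if (need_pr0 && (*fpscr & FPSCR_PR)) {
        return false;
    }
    *fpscr ^= bit;
    return true;
}
```

In this model fschg is toggle_fpscr(&fpscr, FPSCR_SZ, true), frchg is the same with FPSCR_FR, and fpchg is toggle_fpscr(&fpscr, FPSCR_PR, false).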
* [Qemu-devel] [PATCH v2 26/27] target/sh4: Implement fsrra
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (24 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 25/27] target/sh4: Add missing FPSCR.PR == 0 checks Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-07 22:27 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 27/27] target/sh4: Use tcg_gen_lookup_and_goto_ptr Richard Henderson
2017-07-18 7:51 ` [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Aurelien Jarno
27 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/helper.h | 1 +
target/sh4/op_helper.c | 16 ++++++++++++++++
target/sh4/translate.c | 2 ++
3 files changed, 19 insertions(+)
diff --git a/target/sh4/helper.h b/target/sh4/helper.h
index 6c6fa04..ea92dc0 100644
--- a/target/sh4/helper.h
+++ b/target/sh4/helper.h
@@ -37,6 +37,7 @@ DEF_HELPER_FLAGS_3(fsub_FT, TCG_CALL_NO_WG, f32, env, f32, f32)
DEF_HELPER_FLAGS_3(fsub_DT, TCG_CALL_NO_WG, f64, env, f64, f64)
DEF_HELPER_FLAGS_2(fsqrt_FT, TCG_CALL_NO_WG, f32, env, f32)
DEF_HELPER_FLAGS_2(fsqrt_DT, TCG_CALL_NO_WG, f64, env, f64)
+DEF_HELPER_FLAGS_2(fsrra_FT, TCG_CALL_NO_WG, i32, env, i32)
DEF_HELPER_FLAGS_2(ftrc_FT, TCG_CALL_NO_WG, i32, env, f32)
DEF_HELPER_FLAGS_2(ftrc_DT, TCG_CALL_NO_WG, i32, env, f64)
DEF_HELPER_3(fipr, void, env, i32, i32)
diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index 8513f38..d798f23 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -406,6 +406,22 @@ float64 helper_fsqrt_DT(CPUSH4State *env, float64 t0)
return t0;
}
+float32 helper_fsrra_FT(CPUSH4State *env, float32 t0)
+{
+ set_float_exception_flags(0, &env->fp_status);
+ /* "Approximate" 1/sqrt(x) via actual computation. */
+ t0 = float32_sqrt(t0, &env->fp_status);
+ t0 = float32_div(float32_one, t0, &env->fp_status);
+ /* Since this is supposed to be an approximation, an imprecision
+ exception is required. One supposes this also follows the usual
+ IEEE rule that other exceptions take precedence. */
+ if (get_float_exception_flags(&env->fp_status) == 0) {
+ set_float_exception_flags(float_flag_inexact, &env->fp_status);
+ }
+ update_fpscr(env, GETPC());
+ return t0;
+}
+
float32 helper_fsub_FT(CPUSH4State *env, float32 t0, float32 t1)
{
set_float_exception_flags(0, &env->fp_status);
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 2b62e39..5fae872 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -1753,6 +1753,8 @@ static void _decode_opc(DisasContext * ctx)
return;
case 0xf07d: /* fsrra FRn */
CHECK_FPU_ENABLED
+ CHECK_FPSCR_PR_0
+ gen_helper_fsrra_FT(FREG(B11_8), cpu_env, FREG(B11_8));
break;
case 0xf08d: /* fldi0 FRn - FPSCR: R[PR] */
CHECK_FPU_ENABLED
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [Qemu-devel] [PATCH v2 27/27] target/sh4: Use tcg_gen_lookup_and_goto_ptr
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (25 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 26/27] target/sh4: Implement fsrra Richard Henderson
@ 2017-07-07 2:21 ` Richard Henderson
2017-07-18 7:51 ` [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Aurelien Jarno
27 siblings, 0 replies; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 2:21 UTC (permalink / raw)
To: qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target/sh4/translate.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 5fae872..7e80e10 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -235,12 +235,15 @@ static inline void gen_save_cpu_state(DisasContext *ctx, bool save_pc)
}
}
+static inline bool use_exit_tb(DisasContext *ctx)
+{
+ return (ctx->tbflags & GUSA_EXCLUSIVE) != 0;
+}
+
static inline bool use_goto_tb(DisasContext *ctx, target_ulong dest)
{
- if (unlikely(ctx->singlestep_enabled)) {
- return false;
- }
- if (ctx->tbflags & GUSA_EXCLUSIVE) {
+ /* Use a direct jump if in same page and singlestep not enabled */
+ if (unlikely(ctx->singlestep_enabled || use_exit_tb(ctx))) {
return false;
}
#ifndef CONFIG_USER_ONLY
@@ -253,28 +256,35 @@ static inline bool use_goto_tb(DisasContext *ctx, target_ulong dest)
static void gen_goto_tb(DisasContext *ctx, int n, target_ulong dest)
{
if (use_goto_tb(ctx, dest)) {
- /* Use a direct jump if in same page and singlestep not enabled */
tcg_gen_goto_tb(n);
tcg_gen_movi_i32(cpu_pc, dest);
tcg_gen_exit_tb((uintptr_t)ctx->tb + n);
} else {
tcg_gen_movi_i32(cpu_pc, dest);
- if (ctx->singlestep_enabled)
+ if (ctx->singlestep_enabled) {
gen_helper_debug(cpu_env);
- tcg_gen_exit_tb(0);
+ } else if (use_exit_tb(ctx)) {
+ tcg_gen_exit_tb(0);
+ } else {
+ tcg_gen_lookup_and_goto_ptr(cpu_pc);
+ }
}
}
static void gen_jump(DisasContext * ctx)
{
- if (ctx->delayed_pc == (uint32_t) - 1) {
+ if (ctx->delayed_pc == -1) {
/* Target is not statically known, it comes necessarily from a
delayed jump, as immediate jumps are conditional jumps */
tcg_gen_mov_i32(cpu_pc, cpu_delayed_pc);
tcg_gen_discard_i32(cpu_delayed_pc);
- if (ctx->singlestep_enabled)
+ if (ctx->singlestep_enabled) {
gen_helper_debug(cpu_env);
- tcg_gen_exit_tb(0);
+ } else if (use_exit_tb(ctx)) {
+ tcg_gen_exit_tb(0);
+ } else {
+ tcg_gen_lookup_and_goto_ptr(cpu_pc);
+ }
} else {
gen_goto_tb(ctx, 0, ctx->delayed_pc);
}
--
2.9.4
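For readers following the control flow, the choice of TB exit made by gen_goto_tb and gen_jump after this patch can be summarized as a small decision function. This is a sketch, not code from the tree: the enum, the function name and the direct_ok parameter (standing for "the destination is statically known and passes the same-page test") are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical labels for the four ways the patch can end a TB. */
typedef enum {
    SKETCH_EXIT_GOTO_TB,     /* tcg_gen_goto_tb + tcg_gen_exit_tb(tb + n) */
    SKETCH_EXIT_DEBUG,       /* gen_helper_debug */
    SKETCH_EXIT_PLAIN,       /* tcg_gen_exit_tb(0) */
    SKETCH_EXIT_LOOKUP_PTR,  /* tcg_gen_lookup_and_goto_ptr */
} SketchExit;

/* direct_ok is false for a dynamic target (delayed_pc == -1) and for
   a static target that fails the same-page test. */
static SketchExit sketch_pick_exit(bool singlestep, bool gusa_exclusive,
                                   bool direct_ok)
{
    if (direct_ok && !singlestep && !gusa_exclusive) {
        return SKETCH_EXIT_GOTO_TB;
    }
    if (singlestep) {
        return SKETCH_EXIT_DEBUG;
    }
    if (gusa_exclusive) {
        return SKETCH_EXIT_PLAIN;
    }
    return SKETCH_EXIT_LOOKUP_PTR;
}
```

The only behavioural change is the last case: branches that previously fell back to a plain exit now try the lookup-and-goto path, unless single-stepping or a GUSA_EXCLUSIVE region forbids it.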
^ permalink raw reply related [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery Richard Henderson
@ 2017-07-07 7:25 ` John Paul Adrian Glaubitz
2017-07-07 8:20 ` Richard Henderson
2017-07-07 9:05 ` [Qemu-devel] [PATCH v2 08/27] " Laurent Vivier
2017-07-15 22:52 ` Aurelien Jarno
2 siblings, 1 reply; 89+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-07-07 7:25 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
On 07/07/2017 04:20 AM, Richard Henderson wrote:
> We translate gUSA regions atomically in a parallel context.
> But in a serial context a gUSA region may be interrupted.
> In that case, restart the region as the kernel would.
This patch is still causing random segfaults, unfortunately:
Setting up dpkg (1.18.24+b1) ...
Setting up perl-base (5.24.1-5) ...
Setting up grep (2.27-2) ...
Setting up debconf (1.5.62) ...
Setting up tzdata (2017b-2) ...
Current default time zone: 'Etc/UTC'
Local time is now: Fri Jul 7 07:17:20 UTC 2017.
Universal Time is now: Fri Jul 7 07:17:20 UTC 2017.
Run 'dpkg-reconfigure tzdata' if you wish to change it.
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
dpkg: error processing package tzdata (--configure):
subprocess installed post-installation script was killed by signal (Segmentation fault)
Setting up gzip (1.6-5) ...
Setting up dash (0.5.8-2.4) ...
Setting up init-system-helpers (1.48) ...
Setting up libpam0g:sh4 (1.1.8-3.6) ...
Setting up libpam-modules-bin (1.1.8-3.6) ...
Setting up bash (4.4-5) ...
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 7:25 ` John Paul Adrian Glaubitz
@ 2017-07-07 8:20 ` Richard Henderson
2017-07-07 8:30 ` John Paul Adrian Glaubitz
0 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 8:20 UTC (permalink / raw)
To: John Paul Adrian Glaubitz, qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
On 07/06/2017 09:25 PM, John Paul Adrian Glaubitz wrote:
> On 07/07/2017 04:20 AM, Richard Henderson wrote:
>> We translate gUSA regions atomically in a parallel context.
>> But in a serial context a gUSA region may be interrupted.
>> In that case, restart the region as the kernel would.
>
> This patch is still causing random segfaults, unfortunately:
>
> Setting up dpkg (1.18.24+b1) ...
> Setting up perl-base (5.24.1-5) ...
> Setting up grep (2.27-2) ...
> Setting up debconf (1.5.62) ...
> Setting up tzdata (2017b-2) ...
>
> Current default time zone: 'Etc/UTC'
> Local time is now: Fri Jul 7 07:17:20 UTC 2017.
> Universal Time is now: Fri Jul 7 07:17:20 UTC 2017.
> Run 'dpkg-reconfigure tzdata' if you wish to change it.
>
> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> dpkg: error processing package tzdata (--configure):
> subprocess installed post-installation script was killed by signal (Segmentation fault)
> Setting up gzip (1.6-5) ...
> Setting up dash (0.5.8-2.4) ...
> Setting up init-system-helpers (1.48) ...
> Setting up libpam0g:sh4 (1.1.8-3.6) ...
> Setting up libpam-modules-bin (1.1.8-3.6) ...
> Setting up bash (4.4-5) ...
How do I reproduce this from the filesystem image you linked earlier?
r~
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 8:20 ` Richard Henderson
@ 2017-07-07 8:30 ` John Paul Adrian Glaubitz
2017-07-07 8:35 ` John Paul Adrian Glaubitz
0 siblings, 1 reply; 89+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-07-07 8:30 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
On 07/07/2017 10:20 AM, Richard Henderson wrote:
> How do I reproduce this from the filesystem image you linked earlier?
So, the problem happens in the tzdata package when it's being configured:
(sid-sh4-sbuild)root@nofan:/# dpkg-reconfigure tzdata
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US:en",
LC_ALL = "en_US.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
debconf: unable to initialize frontend: Dialog
debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 76.)
debconf: falling back to frontend: Readline
Configuring tzdata
------------------
Please select the geographic area in which you live. Subsequent configuration questions will narrow this down by
presenting a list of cities, representing the time zones in which they are located.
1. Africa 3. Antarctica 5. Arctic 7. Atlantic 9. Indian 11. SystemV 13. Etc
2. America 4. Australia 6. Asia 8. Europe 10. Pacific 12. US
Geographic area: 8
Please select the city or region corresponding to your time zone.
1. Amsterdam 10. Bucharest 19. Isle_of_Man 28. Luxembourg 37. Paris 46. Simferopol 55. Vaduz
2. Andorra 11. Budapest 20. Istanbul 29. Madrid 38. Podgorica 47. Skopje 56. Vatican
3. Astrakhan 12. Busingen 21. Jersey 30. Malta 39. Prague 48. Sofia 57. Vienna
4. Athens 13. Chisinau 22. Kaliningrad 31. Mariehamn 40. Riga 49. Stockholm 58. Vilnius
5. Belfast 14. Copenhagen 23. Kiev 32. Minsk 41. Rome 50. Tallinn 59. Volgograd
6. Belgrade 15. Dublin 24. Kirov 33. Monaco 42. Samara 51. Tirane 60. Warsaw
7. Berlin 16. Gibraltar 25. Lisbon 34. Moscow 43. San_Marino 52. Tiraspol 61. Zagreb
8. Bratislava 17. Guernsey 26. Ljubljana 35. Nicosia 44. Sarajevo 53. Ulyanovsk 62. Zaporozhye
9. Brussels 18. Helsinki 27. London 36. Oslo 45. Saratov 54. Uzhgorod 63. Zurich
Time zone: 7
Current default time zone: 'Europe/Berlin'
Local time is now: Fri Jul 7 10:26:31 CEST 2017.
Universal Time is now: Fri Jul 7 08:26:31 UTC 2017.
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
(sid-sh4-sbuild)root@nofan:/#
The scripts which are run here can be found as /var/lib/dpkg/info/tzdata.{config,postinst}. I don't know
yet which command in particular triggers the crash.
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 8:30 ` John Paul Adrian Glaubitz
@ 2017-07-07 8:35 ` John Paul Adrian Glaubitz
2017-07-07 16:22 ` Richard Henderson
[not found] ` <20170707163826.22631-1-rth@twiddle.net>
0 siblings, 2 replies; 89+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-07-07 8:35 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
On 07/07/2017 10:30 AM, John Paul Adrian Glaubitz wrote:
> The scripts which are run here can be found as /var/lib/dpkg/info/tzdata.{config,postinst}.
> I don't know yet which command in particular triggers the crash.
Interesting. It crashes for me immediately after resizing the terminal window with:
(sid-sh4-sbuild)root@nofan:/# Unhandled trap: 0x180
pc=0xf6ffe9fa sr=0x00000101 pr=0x004a73c2 fpscr=0x00080000
spc=0x00000000 ssr=0x00000000 gbr=0xf6646470 vbr=0x00000000
sgr=0x00000000 dbr=0x00000000 delayed_pc=0xf6ffea14 fpul=0x00000000
r0=0xfffffffc r1=0xf6ffea1c r2=0x00000000 r3=0x00000134
r4=0x00000001 r5=0xf6ffeac0 r6=0x00000000 r7=0x00000000
r8=0x00000000 r9=0x0041a050 r10=0xf677e4b8 r11=0xf6ffea40
r12=0xf677ec54 r13=0x00000000 r14=0x0041a0a4 r15=0xf6ffea1c
r16=0x00000000 r17=0x00000000 r18=0x00000000 r19=0x00000000
r20=0x00000000 r21=0x00000000 r22=0x00000000 r23=0x00000000
root@nofan:~>
I did not enter any commands. Just chrooting into the chroot, then resizing the
terminal window was enough.
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery Richard Henderson
2017-07-07 7:25 ` John Paul Adrian Glaubitz
@ 2017-07-07 9:05 ` Laurent Vivier
2017-07-07 9:09 ` Laurent Vivier
2017-07-07 9:13 ` John Paul Adrian Glaubitz
2017-07-15 22:52 ` Aurelien Jarno
2 siblings, 2 replies; 89+ messages in thread
From: Laurent Vivier @ 2017-07-07 9:05 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: aurelien, bruno, glaubitz
Le 07/07/2017 à 04:20, Richard Henderson a écrit :
> We translate gUSA regions atomically in a parallel context.
> But in a serial context a gUSA region may be interrupted.
> In that case, restart the region as the kernel would.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> linux-user/signal.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index 3d18d1b..a537778 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -3471,6 +3471,25 @@ static abi_ulong get_sigframe(struct target_sigaction *ka,
> return (sp - frame_size) & -8ul;
> }
>
> +/* Notice when we're in the middle of a gUSA region and reset.
> + Note that this will only occur for !parallel_cpus, as we will
> + translate such sequences differently in a parallel context. */
> +static void unwind_gusa(CPUSH4State *regs)
> +{
> + /* If the stack pointer is sufficiently negative ... */
> + if ((regs->gregs[15] & 0xc0000000u) == 0xc0000000u
> + /* ... and we haven't completed the sequence ... */
> + && regs->pc < regs->gregs[0]) {
> + /* Reset the PC to before the gUSA region, as computed from
> + R0 = region end, SP = -(region size), plus one more insn
> + that actually sets SP to the region size. */
> + regs->pc = regs->gregs[0] + regs->gregs[15] - 2;
> +
> + /* Reset the SP to the saved version in R1. */
> + regs->gregs[15] = regs->gregs[1];
> + }
> +}
> +
> static void setup_sigcontext(struct target_sigcontext *sc,
> CPUSH4State *regs, unsigned long mask)
> {
> @@ -3534,6 +3553,8 @@ static void setup_frame(int sig, struct target_sigaction *ka,
> abi_ulong frame_addr;
> int i;
>
> + unwind_gusa(regs);
> +
> frame_addr = get_sigframe(ka, regs->gregs[15], sizeof(*frame));
I think unwind_gusa() should be moved after the get_sigframe() (in both
cases), because r15 can be updated and the sigframe base lost.
@@ -3551,9 +3552,8 @@ static void setup_frame(int sig, struct target_sigaction *ka,
abi_ulong frame_addr;
int i;
- unwind_gusa(regs);
-
frame_addr = get_sigframe(ka, regs->gregs[15], sizeof(*frame));
+ unwind_gusa(regs);
trace_user_setup_frame(regs, frame_addr);
if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0)) {
goto give_sigsegv;
@@ -3602,9 +3602,8 @@ static void setup_rt_frame(int sig, struct target_sigaction *ka,
abi_ulong frame_addr;
int i;
- unwind_gusa(regs);
-
frame_addr = get_sigframe(ka, regs->gregs[15], sizeof(*frame));
+ unwind_gusa(regs);
trace_user_setup_rt_frame(regs, frame_addr);
if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0)) {
goto give_sigsegv;
Laurent
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 9:05 ` [Qemu-devel] [PATCH v2 08/27] " Laurent Vivier
@ 2017-07-07 9:09 ` Laurent Vivier
2017-07-07 9:13 ` John Paul Adrian Glaubitz
1 sibling, 0 replies; 89+ messages in thread
From: Laurent Vivier @ 2017-07-07 9:09 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: aurelien, bruno, glaubitz
Le 07/07/2017 à 11:05, Laurent Vivier a écrit :
> Le 07/07/2017 à 04:20, Richard Henderson a écrit :
>> We translate gUSA regions atomically in a parallel context.
>> But in a serial context a gUSA region may be interrupted.
>> In that case, restart the region as the kernel would.
>>
>> Signed-off-by: Richard Henderson <rth@twiddle.net>
>> ---
>> linux-user/signal.c | 23 +++++++++++++++++++++++
>> 1 file changed, 23 insertions(+)
>>
>> diff --git a/linux-user/signal.c b/linux-user/signal.c
>> index 3d18d1b..a537778 100644
>> --- a/linux-user/signal.c
>> +++ b/linux-user/signal.c
>> @@ -3471,6 +3471,25 @@ static abi_ulong get_sigframe(struct target_sigaction *ka,
>> return (sp - frame_size) & -8ul;
>> }
>>
>> +/* Notice when we're in the middle of a gUSA region and reset.
>> + Note that this will only occur for !parallel_cpus, as we will
>> + translate such sequences differently in a parallel context. */
>> +static void unwind_gusa(CPUSH4State *regs)
>> +{
>> + /* If the stack pointer is sufficiently negative ... */
>> + if ((regs->gregs[15] & 0xc0000000u) == 0xc0000000u
>> + /* ... and we haven't completed the sequence ... */
>> + && regs->pc < regs->gregs[0]) {
>> + /* Reset the PC to before the gUSA region, as computed from
>> + R0 = region end, SP = -(region size), plus one more insn
>> + that actually sets SP to the region size. */
>> + regs->pc = regs->gregs[0] + regs->gregs[15] - 2;
>> +
>> + /* Reset the SP to the saved version in R1. */
>> + regs->gregs[15] = regs->gregs[1];
>> + }
>> +}
>> +
>> static void setup_sigcontext(struct target_sigcontext *sc,
>> CPUSH4State *regs, unsigned long mask)
>> {
>> @@ -3534,6 +3553,8 @@ static void setup_frame(int sig, struct target_sigaction *ka,
>> abi_ulong frame_addr;
>> int i;
>>
>> + unwind_gusa(regs);
>> +
>> frame_addr = get_sigframe(ka, regs->gregs[15], sizeof(*frame));
>
> I think unwind_gusa() should be moved after the get_sigframe() (in both
> cases), because r15 can be updated and the sigframe base lost.
>
No, it's stupid, r15 is negative
Laurent
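To spell out why the original ordering holds: inside a gUSA region r15 contains -(region size), not a usable stack pointer, so the unwind has to restore it from r1 before get_sigframe() reads it. A minimal sketch of the arithmetic from unwind_gusa follows; the struct, the function name and the sample values in the note below are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical register snapshot; only the fields the unwind uses. */
typedef struct {
    uint32_t pc;
    uint32_t r0;   /* region end */
    uint32_t r1;   /* saved stack pointer */
    uint32_t r15;  /* -(region size) while inside the region */
} SketchRegs;

/* Same arithmetic as the body of unwind_gusa, with the entry
   condition left out: restart at the insn that loads SP, which sits
   2 bytes before the region start, then restore the real SP. */
static void sketch_unwind(SketchRegs *r)
{
    r->pc = r->r0 + r->r15 - 2;
    r->r15 = r->r1;
}
```

With r0 = 0x1010, r15 = -8 and r1 = 0x7fff0000, the restart pc becomes 0x1006 and r15 becomes 0x7fff0000, which is the value the sigframe should be based on.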
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 9:05 ` [Qemu-devel] [PATCH v2 08/27] " Laurent Vivier
2017-07-07 9:09 ` Laurent Vivier
@ 2017-07-07 9:13 ` John Paul Adrian Glaubitz
1 sibling, 0 replies; 89+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-07-07 9:13 UTC (permalink / raw)
To: Laurent Vivier, Richard Henderson, qemu-devel; +Cc: aurelien, bruno, glaubitz
On 07/07/2017 11:05 AM, Laurent Vivier wrote:
> I think unwind_gusa() should be moved after the get_sigframe() (in both
> cases), because r15 can be updated and the sigframe base lost.
Tried the change. "dpkg-reconfigure tzdata" now crashes differently:
Current default time zone: 'Europe/Berlin'
Local time is now: Fri Jul 7 11:11:52 CEST 2017.
Universal Time is now: Fri Jul 7 09:11:52 UTC 2017.
Unhandled trap: 0x180
pc=0xf6fff39a sr=0x00000100 pr=0x00512c28 fpscr=0x00080004
spc=0x00000000 ssr=0x00000000 gbr=0xf6585470 vbr=0x00000000
sgr=0x00000000 dbr=0x00000000 delayed_pc=0xf6fff3ac fpul=0x00000002
r0=0xffffffe0 r1=0x00000470 r2=0x00afc048 r3=0x00000004
r4=0x0000000a r5=0x00bbdc18 r6=0x00000001 r7=0x00000001
r8=0x00bbdc18 r9=0x00bbdc18 r10=0x00000001 r11=0x0041728c
r12=0x0000000a r13=0x0041744c r14=0x0058a384 r15=0x00000470
r16=0x00000000 r17=0x00000000 r18=0x00000000 r19=0x00000000
r20=0x00000000 r21=0x00000000 r22=0x00000000 r23=0x00000000
(sid-sh4-sbuild)root@nofan:/#
The crash on terminal resize is still present, too.
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 8:35 ` John Paul Adrian Glaubitz
@ 2017-07-07 16:22 ` Richard Henderson
2017-07-13 9:09 ` John Paul Adrian Glaubitz
[not found] ` <20170707163826.22631-1-rth@twiddle.net>
1 sibling, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 16:22 UTC (permalink / raw)
To: John Paul Adrian Glaubitz, qemu-devel; +Cc: aurelien, laurent, bruno, glaubitz
On 07/06/2017 10:35 PM, John Paul Adrian Glaubitz wrote:
> On 07/07/2017 10:30 AM, John Paul Adrian Glaubitz wrote:
>> The scripts which are run here can be found as /var/lib/dpkg/info/tzdata.{config,postinst}.
>> I don't know yet which command in particular triggers the crash.
> Interesting. It crashes for me immediately after resizing the terminal window with:
>
> (sid-sh4-sbuild)root@nofan:/# Unhandled trap: 0x180
> pc=0xf6ffe9fa sr=0x00000101 pr=0x004a73c2 fpscr=0x00080000
> spc=0x00000000 ssr=0x00000000 gbr=0xf6646470 vbr=0x00000000
> sgr=0x00000000 dbr=0x00000000 delayed_pc=0xf6ffea14 fpul=0x00000000
> r0=0xfffffffc r1=0xf6ffea1c r2=0x00000000 r3=0x00000134
> r4=0x00000001 r5=0xf6ffeac0 r6=0x00000000 r7=0x00000000
> r8=0x00000000 r9=0x0041a050 r10=0xf677e4b8 r11=0xf6ffea40
> r12=0xf677ec54 r13=0x00000000 r14=0x0041a0a4 r15=0xf6ffea1c
> r16=0x00000000 r17=0x00000000 r18=0x00000000 r19=0x00000000
> r20=0x00000000 r21=0x00000000 r22=0x00000000 r23=0x00000000
> root@nofan:~>
>
> I did not enter any commands. Just chrooting into the chroot, then resizing the
> terminal window was enough.
Thanks for the hint. I've got it now.
The problem is that sh4-linux-user does not limit the page mappings in the same
way that the sh4 kernel does. So we begin the program with the stack mapped at
0xf7xxxxxx, which matches our normal check of 0xc0000000.
I think a more restricted check of -128 -- the most negative value that can be
placed by mov #imm,sp -- will work and is more appropriate. Indeed, I assume
that the only reason the kernel doesn't perform the check that way is having to
do it all in sh assembly with minimal free registers.
r~
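The difference between the two checks can be seen by feeding them the register values from the trap dump quoted above (r15=0xf6ffea1c, pc=0xf6ffe9fa, r0=0xfffffffc). The sketch below is only an illustration; the function names are invented, and the second predicate is the tighter test proposed here, not code already in the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Check as posted in v2: top two bits of SP set, and pc below r0. */
static bool sketch_old_check(uint32_t sp, uint32_t pc, uint32_t r0)
{
    return (sp & 0xc0000000u) == 0xc0000000u && pc < r0;
}

/* Tighter check: SP must be in [-128, -1], the range reachable
   with mov #imm,sp, and pc below r0. */
static bool sketch_new_check(uint32_t sp, uint32_t pc, uint32_t r0)
{
    return sp >= (uint32_t)-128 && pc < r0;
}
```

For the dump values the old predicate is true, because an ordinary stack at 0xf6ffea1c has both top bits set, while the new predicate is false. For a genuine region (say sp = -8, pc = 0x1008, r0 = 0x1010) both are true.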
^ permalink raw reply [flat|nested] 89+ messages in thread
* [Qemu-devel] Fwd: [PATCH v2.5] fixup! linux-user/sh4: Notice gUSA regions during signal delivery
[not found] ` <20170707163826.22631-1-rth@twiddle.net>
@ 2017-07-07 17:57 ` Richard Henderson
2017-07-07 19:00 ` Richard Henderson
0 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 17:57 UTC (permalink / raw)
To: qemu-devel
Bah. Should have gone to the list as well.
r~
---------- Forwarded message ----------
From: Richard Henderson <rth@twiddle.net>
Date: Fri, Jul 7, 2017 at 6:38 AM
Subject: [PATCH v2.5] fixup! linux-user/sh4: Notice gUSA regions during
signal delivery
To: glaubitz@physik.fu-berlin.de
Cc: laurent@vivier.eu
This fixes the signal delivery problem reported. I'll fold this into
the patch properly for v3, but this will allow reasonable testing to
proceed in the meantime.
r~
---
linux-user/signal.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 8c0b851..d68bd26 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -3476,13 +3476,18 @@ static abi_ulong get_sigframe(struct target_sigaction *ka,
translate such sequences differently in a parallel context. */
static void unwind_gusa(CPUSH4State *regs)
{
- /* If the stack pointer is sufficiently negative ... */
- if ((regs->gregs[15] & 0xc0000000u) == 0xc0000000u
- /* ... and we haven't completed the sequence ... */
- && regs->pc < regs->gregs[0]) {
+ /* If the stack pointer is sufficiently negative, and we haven't
+ completed the sequence, then reset to the entry to the region. */
+ /* ??? The SH4 kernel checks for an address above 0xC0000000.
+ However, the page mappings in qemu linux-user aren't as restricted
+ and we wind up with the normal stack mapped above 0xF0000000.
+ That said, there is no reason why the kernel should be allowing
+ a gUSA region that spans 1GB. Use a tighter check here, for what
+ can actually be enabled by the immediate move. */
+ if (regs->gregs[15] >= -128u && regs->pc < regs->gregs[0]) {
/* Reset the PC to before the gUSA region, as computed from
- R0 = region end, SP = -(region size), plus one more insn
- that actually sets SP to the region size. */
+ R0 = region end, SP = -(region size), plus one more for the
+ insn that actually initializes SP to the region size. */
regs->pc = regs->gregs[0] + regs->gregs[15] - 2;
/* Reset the SP to the saved version in R1. */
--
2.9.4
^ permalink raw reply related [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] Fwd: [PATCH v2.5] fixup! linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 17:57 ` [Qemu-devel] Fwd: [PATCH v2.5] fixup! " Richard Henderson
@ 2017-07-07 19:00 ` Richard Henderson
2017-07-17 14:15 ` Aurelien Jarno
0 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-07 19:00 UTC (permalink / raw)
To: qemu-devel
On 07/07/2017 07:57 AM, Richard Henderson wrote:
> + /* ??? The SH4 kernel checks for an address above 0xC0000000.
> + However, the page mappings in qemu linux-user aren't as restricted
> + and we wind up with the normal stack mapped above 0xF0000000.
> + That said, there is no reason why the kernel should be allowing
> + a gUSA region that spans 1GB. Use a tighter check here, for what
> + can actually be enabled by the immediate move. */
Additionally, I can (and should) fix the address space problem for SH4 in
linux-user/main.c, where we have already done so for MIPS and Nios2.
See the initialization of reserved_va.
r~
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 02/27] target/sh4: Consolidate end-of-TB tests
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 02/27] target/sh4: Consolidate end-of-TB tests Richard Henderson
@ 2017-07-07 21:42 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:42 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> We can fold 3 different tests within the decode loop
> into a more accurate computation of max_insns to start.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index 82d4d69..663b5c0 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -1848,7 +1848,6 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
> ctx.features = env->features;
> ctx.has_movcal = (ctx.tbflags & TB_FLAG_PENDING_MOVCA);
>
> - num_insns = 0;
> max_insns = tb->cflags & CF_COUNT_MASK;
> if (max_insns == 0) {
> max_insns = CF_COUNT_MASK;
> @@ -1856,9 +1855,23 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
> if (max_insns > TCG_MAX_INSNS) {
> max_insns = TCG_MAX_INSNS;
> }
> + /* Since the ISA is fixed-width, we can bound by the number
> + of instructions remaining on the page. */
> + num_insns = (TARGET_PAGE_SIZE - (ctx.pc & (TARGET_PAGE_SIZE - 1))) / 2;
This could be written as num_insns = -(ctx.pc | TARGET_PAGE_MASK) / 2;
> + if (max_insns > num_insns) {
> + max_insns = num_insns;
> + }
You are following the existing pattern, so I can't really blame you, but
maybe it's the moment to change the existing code into something like:
max_insns = MIN(max_insns, ...)
> + /* Single stepping means just that. */
> + if (ctx.singlestep_enabled || singlestep) {
> + max_insns = 1;
> + }
>
> gen_tb_start(tb);
> - while (ctx.bstate == BS_NONE && !tcg_op_buf_full()) {
> + num_insns = 0;
> +
> + while (ctx.bstate == BS_NONE
> + && num_insns < max_insns
> + && !tcg_op_buf_full()) {
> tcg_gen_insn_start(ctx.pc, ctx.envflags);
> num_insns++;
>
> @@ -1882,18 +1895,10 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
> ctx.opcode = cpu_lduw_code(env, ctx.pc);
> decode_opc(&ctx);
> ctx.pc += 2;
> - if ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0)
> - break;
> - if (cs->singlestep_enabled) {
> - break;
> - }
> - if (num_insns >= max_insns)
> - break;
> - if (singlestep)
> - break;
> }
> - if (tb->cflags & CF_LAST_IO)
> + if (tb->cflags & CF_LAST_IO) {
> gen_io_end();
> + }
> if (cs->singlestep_enabled) {
> gen_save_cpu_state(&ctx, true);
> gen_helper_debug(cpu_env);
Besides the minor nitpicks above:
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
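The two ways of counting the instructions left on the page, as discussed in the review above, can be compared with a small standalone sketch. The 4 KiB page size, the macro names and the function names are assumptions for illustration; the real code uses TARGET_PAGE_SIZE and TARGET_PAGE_MASK, and the pc is taken to be a 32-bit unsigned value.

```c
#include <stdint.h>

/* Hypothetical page geometry for the sketch (4 KiB pages). */
#define SKETCH_PAGE_SIZE 0x1000u
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Form used in the patch: bytes left on the page, divided by the
   fixed 2-byte instruction width. */
static uint32_t sketch_insns_left_long(uint32_t pc)
{
    return (SKETCH_PAGE_SIZE - (pc & (SKETCH_PAGE_SIZE - 1))) / 2;
}

/* Form suggested in review: OR-ing in the page mask gives
   offset - page size modulo 2^32, so negating it yields the
   number of bytes left. */
static uint32_t sketch_insns_left_short(uint32_t pc)
{
    return -(pc | SKETCH_PAGE_MASK) / 2;
}
```

Both forms give 2048 at the start of a page and 1 for the last instruction slot, so the shorter expression looks like a drop-in replacement under these assumptions.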
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 03/27] target/sh4: Introduce TB_FLAG_ENVFLAGS_MASK
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 03/27] target/sh4: Introduce TB_FLAG_ENVFLAGS_MASK Richard Henderson
@ 2017-07-07 21:42 ` Aurelien Jarno
2017-07-08 16:29 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:42 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> We'll be putting more things into this bitmask soon.
> Let's have a name that covers all possible uses.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/cpu.h | 4 +++-
> target/sh4/translate.c | 4 ++--
> 2 files changed, 5 insertions(+), 3 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean Richard Henderson
@ 2017-07-07 21:42 ` Aurelien Jarno
2017-07-08 16:31 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:42 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> If we mask off any out-of-band bits before we assign to the
> variable, then we don't need to clean it up when reading.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/cpu.h | 2 +-
> target/sh4/cpu.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA Richard Henderson
@ 2017-07-07 21:42 ` Aurelien Jarno
2017-07-08 16:31 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:42 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> Don't leave an unused bit after DELAY_SLOT_MASK.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/cpu.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 10/27] target/sh4: Hoist register bank selection
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 10/27] target/sh4: Hoist register bank selection Richard Henderson
@ 2017-07-07 21:48 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:48 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> Compute which register bank to use once at the start of translation.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 11/27] target/sh4: Unify cpu_fregs into FREG
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 11/27] target/sh4: Unify cpu_fregs into FREG Richard Henderson
@ 2017-07-07 21:54 ` Aurelien Jarno
2017-07-08 16:54 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:54 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> We were treating FREG as an index and REG as a TCGv.
> Making FREG return a TCGv is both less confusing and
> a step toward cleaner banking of cpu_fregs.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 125 ++++++++++++++++++++-----------------------------
> 1 file changed, 52 insertions(+), 73 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 12/27] target/sh4: Pass DisasContext to fpr64 routines
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 12/27] target/sh4: Pass DisasContext to fpr64 routines Richard Henderson
@ 2017-07-07 21:55 ` Aurelien Jarno
2017-07-08 16:56 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:55 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 13/27] target/sh4: Hoist fp register bank selection
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 13/27] target/sh4: Hoist fp register bank selection Richard Henderson
@ 2017-07-07 21:57 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:57 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> Compute which register bank to use once at the start of translation.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 14/27] target/sh4: Eliminate unused XREG macro
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 14/27] target/sh4: Eliminate unused XREG macro Richard Henderson
@ 2017-07-07 21:59 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 21:59 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 1 -
> 1 file changed, 1 deletion(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 15/27] target/sh4: Merge DREG into fpr64 routines
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 15/27] target/sh4: Merge DREG into fpr64 routines Richard Henderson
@ 2017-07-07 22:06 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:06 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> Also add a debugging assert that we did signal illegal opc
> for odd double-precision registers.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 26 +++++++++++++++-----------
> 1 file changed, 15 insertions(+), 11 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 16/27] target/sh4: Load/store Dr as 64-bit quantities
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 16/27] target/sh4: Load/store Dr as 64-bit quantities Richard Henderson
@ 2017-07-07 22:14 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:14 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> This enforces proper alignment and makes the register update
> more natural. Note that there is a more serious bug fix for
> fmov {DX}Rn,@(R0,Rn) to use a store instead of a load.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 74 ++++++++++++++++++++++++--------------------------
> 1 file changed, 35 insertions(+), 39 deletions(-)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index 616e615..fcdabe8 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -1044,18 +1038,20 @@ static void _decode_opc(DisasContext * ctx)
> return;
> case 0xf00b: /* fmov {F,D,X}Rm,@-Rn - FPSCR: Nothing */
> CHECK_FPU_ENABLED
> - TCGv addr = tcg_temp_new_i32();
> - tcg_gen_subi_i32(addr, REG(B11_8), 4);
> - if (ctx->tbflags & FPSCR_SZ) {
> - int fr = XHACK(B7_4);
> - tcg_gen_qemu_st_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
> - tcg_gen_subi_i32(addr, addr, 4);
> - tcg_gen_qemu_st_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
> - } else {
> - tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
> - }
> - tcg_gen_mov_i32(REG(B11_8), addr);
> - tcg_temp_free(addr);
> + {
> + TCGv addr = tcg_temp_new_i32();
> + if (ctx->tbflags & FPSCR_SZ) {
> + TCGv_i64 fp = tcg_temp_new_i64();
> + gen_load_fpr64(ctx, fp, XHACK(B7_4));
> + tcg_gen_qemu_st_i64(fp, addr, ctx->memidx, MO_TEQ);
addr is used before being written. The following line is missing
before the load:
tcg_gen_subi_i32(addr, REG(B11_8), 8);
> + tcg_temp_free_i64(fp);
> + } else {
> + tcg_gen_subi_i32(addr, REG(B11_8), 4);
> + tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
> + }
> + tcg_gen_mov_i32(REG(B11_8), addr);
> + tcg_temp_free(addr);
> + }
> return;
> case 0xf006: /* fmov @(R0,Rm),{F,D,X}Rm - FPSCR: Nothing */
> CHECK_FPU_ENABLED
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
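
For reference, the ordering pointed out in the review above (compute the
decremented address, store, then write the address back) can be modelled
in plain C. This is only a sketch under stated assumptions: the mem[]
array and both function names are hypothetical stand-ins, not QEMU code,
and guest endianness is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for guest memory; not a QEMU structure. */
static uint8_t mem[64];

/* Model of "fmov DRm,@-Rn" with FPSCR.SZ set: decrement by 8, then store. */
static void fmov_predec_store64(uint32_t *rn, uint64_t value)
{
    assert(*rn >= 8 && *rn <= sizeof(mem));
    uint32_t addr = *rn - 8;   /* the step the review found missing */
    memcpy(&mem[addr], &value, sizeof(value));
    *rn = addr;                /* write back the decremented address */
}

/* Model of the 32-bit case, which the patch already handled: decrement by 4. */
static void fmov_predec_store32(uint32_t *rn, uint32_t value)
{
    assert(*rn >= 4 && *rn <= sizeof(mem));
    uint32_t addr = *rn - 4;
    memcpy(&mem[addr], &value, sizeof(value));
    *rn = addr;
}
```

In both cases the address temporary is fully defined before the store
and before it is copied back into Rn, which is the property the 64-bit
path in the quoted hunk lacks.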
* Re: [Qemu-devel] [PATCH v2 17/27] target/sh4: Simplify 64-bit fp reg-reg move
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 17/27] target/sh4: Simplify 64-bit fp reg-reg move Richard Henderson
@ 2017-07-07 22:15 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:15 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> We do not need to form full 64-bit quantities in order to perform
> the move. This reduces code expansion on 64-bit hosts.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 18/27] target/sh4: Unify code for CHECK_NOT_DELAY_SLOT
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 18/27] target/sh4: Unify code for CHECK_NOT_DELAY_SLOT Richard Henderson
@ 2017-07-07 22:17 ` Aurelien Jarno
2017-07-08 16:59 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:17 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> We do not need to emit N copies of raising an exception.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 19/27] target/sh4: Unify code for CHECK_PRIVILEGED
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 19/27] target/sh4: Unify code for CHECK_PRIVILEGED Richard Henderson
@ 2017-07-07 22:17 ` Aurelien Jarno
2017-07-08 17:00 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:17 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> We do not need to emit N copies of raising an exception.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 14 ++++----------
> 1 file changed, 4 insertions(+), 10 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 20/27] target/sh4: Unify code for CHECK_FPU_ENABLED
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 20/27] target/sh4: Unify code for CHECK_FPU_ENABLED Richard Henderson
@ 2017-07-07 22:18 ` Aurelien Jarno
2017-07-08 17:01 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:18 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> We do not need to emit N copies of raising an exception.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 24 ++++++++++++++----------
> 1 file changed, 14 insertions(+), 10 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 21/27] target/sh4: Tidy misc illegal insn checks
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 21/27] target/sh4: Tidy misc illegal insn checks Richard Henderson
@ 2017-07-07 22:18 ` Aurelien Jarno
2017-07-08 17:02 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:18 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> Now that we have a do_illegal label, use goto in order
> to self-document the forcing of the exception.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 22/27] target/sh4: Introduce CHECK_FPSCR_PR_*
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 22/27] target/sh4: Introduce CHECK_FPSCR_PR_* Richard Henderson
@ 2017-07-07 22:20 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:20 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 57 +++++++++++++++++++++++++++-----------------------
> 1 file changed, 31 insertions(+), 26 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 23/27] target/sh4: Introduce CHECK_SH4A
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 23/27] target/sh4: Introduce CHECK_SH4A Richard Henderson
@ 2017-07-07 22:21 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:21 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 64 +++++++++++++++++++++++---------------------------
> 1 file changed, 29 insertions(+), 35 deletions(-)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 24/27] target/sh4: Implement fpchg
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 24/27] target/sh4: Implement fpchg Richard Henderson
@ 2017-07-07 22:23 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:23 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 5 +++++
> 1 file changed, 5 insertions(+)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 25/27] target/sh4: Add missing FPSCR.PR == 0 checks
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 25/27] target/sh4: Add missing FPSCR.PR == 0 checks Richard Henderson
@ 2017-07-07 22:24 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:24 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> Both frchg and fschg require PR == 0, otherwise undefined_operation.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 2 ++
> 1 file changed, 2 insertions(+)
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 26/27] target/sh4: Implement fsrra
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 26/27] target/sh4: Implement fsrra Richard Henderson
@ 2017-07-07 22:27 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-07 22:27 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:21, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/helper.h | 1 +
> target/sh4/op_helper.c | 16 ++++++++++++++++
> target/sh4/translate.c | 2 ++
> 3 files changed, 19 insertions(+)
>
> diff --git a/target/sh4/helper.h b/target/sh4/helper.h
> index 6c6fa04..ea92dc0 100644
> --- a/target/sh4/helper.h
> +++ b/target/sh4/helper.h
> @@ -37,6 +37,7 @@ DEF_HELPER_FLAGS_3(fsub_FT, TCG_CALL_NO_WG, f32, env, f32, f32)
> DEF_HELPER_FLAGS_3(fsub_DT, TCG_CALL_NO_WG, f64, env, f64, f64)
> DEF_HELPER_FLAGS_2(fsqrt_FT, TCG_CALL_NO_WG, f32, env, f32)
> DEF_HELPER_FLAGS_2(fsqrt_DT, TCG_CALL_NO_WG, f64, env, f64)
> +DEF_HELPER_FLAGS_2(fsrra_FT, TCG_CALL_NO_WG, i32, env, i32)
That should be f32 instead of i32, to match the float32 types used by
helper_fsrra_FT.
> DEF_HELPER_FLAGS_2(ftrc_FT, TCG_CALL_NO_WG, i32, env, f32)
> DEF_HELPER_FLAGS_2(ftrc_DT, TCG_CALL_NO_WG, i32, env, f64)
> DEF_HELPER_3(fipr, void, env, i32, i32)
> diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
> index 8513f38..d798f23 100644
> --- a/target/sh4/op_helper.c
> +++ b/target/sh4/op_helper.c
> @@ -406,6 +406,22 @@ float64 helper_fsqrt_DT(CPUSH4State *env, float64 t0)
> return t0;
> }
>
> +float32 helper_fsrra_FT(CPUSH4State *env, float32 t0)
> +{
> + set_float_exception_flags(0, &env->fp_status);
> + /* "Approximate" 1/sqrt(x) via actual computation. */
> + t0 = float32_sqrt(t0, &env->fp_status);
> + t0 = float32_div(float32_one, t0, &env->fp_status);
> + /* Since this is supposed to be an approximation, an imprecision
> + exception is required. One supposes this also follows the usual
> + IEEE rule that other exceptions take precidence. */
> + if (get_float_exception_flags(&env->fp_status) == 0) {
> + set_float_exception_flags(float_flag_inexact, &env->fp_status);
> + }
> + update_fpscr(env, GETPC());
> + return t0;
> +}
> +
> float32 helper_fsub_FT(CPUSH4State *env, float32 t0, float32 t1)
> {
> set_float_exception_flags(0, &env->fp_status);
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index 2b62e39..5fae872 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -1753,6 +1753,8 @@ static void _decode_opc(DisasContext * ctx)
> return;
> case 0xf07d: /* fsrra FRn */
> CHECK_FPU_ENABLED
> + CHECK_FPSCR_PR_0
> + gen_helper_fsrra_FT(FREG(B11_8), cpu_env, FREG(B11_8));
> break;
> case 0xf08d: /* fldi0 FRn - FPSCR: R[PR] */
> CHECK_FPU_ENABLED
Otherwise it looks fine.
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
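
The flag handling at the end of the quoted helper_fsrra_FT (force
"inexact" only when the computation raised nothing else) can be isolated
as a tiny C sketch. The flag values and the function name below are
hypothetical and are not the softfloat definitions; only the decision
rule is taken from the patch.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration; not softfloat's values. */
enum {
    FLAG_INVALID   = 1,
    FLAG_DIVBYZERO = 2,
    FLAG_INEXACT   = 4
};

/*
 * Given the flags raised by sqrt followed by divide, return the flags
 * to report: an approximation is always at least inexact, but any
 * exception that was already raised is left untouched.
 */
static int fsrra_final_flags(int raised)
{
    return raised == 0 ? FLAG_INEXACT : raised;
}
```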
* Re: [Qemu-devel] [PATCH v2 03/27] target/sh4: Introduce TB_FLAG_ENVFLAGS_MASK
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 03/27] target/sh4: Introduce TB_FLAG_ENVFLAGS_MASK Richard Henderson
2017-07-07 21:42 ` Aurelien Jarno
@ 2017-07-08 16:29 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 16:29 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:20 PM, Richard Henderson wrote:
> We'll be putting more things into this bitmask soon.
> Let's have a name that covers all possible uses.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/cpu.h | 4 +++-
> target/sh4/translate.c | 4 ++--
> 2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
> index b15116e..240ed36 100644
> --- a/target/sh4/cpu.h
> +++ b/target/sh4/cpu.h
> @@ -96,6 +96,8 @@
> #define DELAY_SLOT_CONDITIONAL (1 << 1)
> #define DELAY_SLOT_RTE (1 << 2)
>
> +#define TB_FLAG_ENVFLAGS_MASK DELAY_SLOT_MASK
> +
> typedef struct tlb_t {
> uint32_t vpn; /* virtual page number */
> uint32_t ppn; /* physical page number */
> @@ -389,7 +391,7 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
> {
> *pc = env->pc;
> *cs_base = 0;
> - *flags = (env->flags & DELAY_SLOT_MASK) /* Bits 0- 2 */
> + *flags = (env->flags & TB_FLAG_ENVFLAGS_MASK) /* Bits 0-2 */
> | (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
> | (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
> | (env->sr & (1u << SR_FD)) /* Bit 15 */
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index 663b5c0..cf53cd6 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -225,7 +225,7 @@ static inline void gen_save_cpu_state(DisasContext *ctx, bool save_pc)
> if (ctx->delayed_pc != (uint32_t) -1) {
> tcg_gen_movi_i32(cpu_delayed_pc, ctx->delayed_pc);
> }
> - if ((ctx->tbflags & DELAY_SLOT_MASK) != ctx->envflags) {
> + if ((ctx->tbflags & TB_FLAG_ENVFLAGS_MASK) != ctx->envflags) {
> tcg_gen_movi_i32(cpu_flags, ctx->envflags);
> }
> }
> @@ -1837,7 +1837,7 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
> pc_start = tb->pc;
> ctx.pc = pc_start;
> ctx.tbflags = (uint32_t)tb->flags;
> - ctx.envflags = tb->flags & DELAY_SLOT_MASK;
> + ctx.envflags = tb->flags & TB_FLAG_ENVFLAGS_MASK;
> ctx.bstate = BS_NONE;
> ctx.memidx = (ctx.tbflags & (1u << SR_MD)) == 0 ? 1 : 0;
> /* We don't know if the delayed pc came from a dynamic or static branch,
>
* Re: [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean Richard Henderson
2017-07-07 21:42 ` Aurelien Jarno
@ 2017-07-08 16:31 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 16:31 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:20 PM, Richard Henderson wrote:
> If we mask off any out-of-band bits before we assign to the
> variable, then we don't need to clean it up when reading.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/cpu.h | 2 +-
> target/sh4/cpu.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
> index 240ed36..6d179a7 100644
> --- a/target/sh4/cpu.h
> +++ b/target/sh4/cpu.h
> @@ -391,7 +391,7 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
> {
> *pc = env->pc;
> *cs_base = 0;
> - *flags = (env->flags & TB_FLAG_ENVFLAGS_MASK) /* Bits 0-2 */
> + *flags = env->flags /* Bits 0-2 */
> | (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
> | (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
> | (env->sr & (1u << SR_FD)) /* Bit 15 */
> diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
> index 9da7e1e..8536f6d 100644
> --- a/target/sh4/cpu.c
> +++ b/target/sh4/cpu.c
> @@ -39,7 +39,7 @@ static void superh_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
> SuperHCPU *cpu = SUPERH_CPU(cs);
>
> cpu->env.pc = tb->pc;
> - cpu->env.flags = tb->flags;
> + cpu->env.flags = tb->flags & TB_FLAG_ENVFLAGS_MASK;
> }
>
> static bool superh_cpu_has_work(CPUState *cs)
>
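
The "mask on write, not on read" idea in this patch can be sketched
outside QEMU as follows. struct mini_env and both function names are
hypothetical; the 0x7 value for DELAY_SLOT_MASK is an assumption derived
from the "Bits 0-2" comment in the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

#define DELAY_SLOT_MASK        0x7u            /* bits 0-2 (assumed value) */
#define TB_FLAG_ENVFLAGS_MASK  DELAY_SLOT_MASK

/* Hypothetical miniature of the CPU state; only the field of interest. */
struct mini_env {
    uint32_t flags;
};

/* Writer: strip everything outside the env-flags mask on assignment. */
static void sync_from_tb(struct mini_env *env, uint32_t tb_flags)
{
    env->flags = tb_flags & TB_FLAG_ENVFLAGS_MASK;
}

/* Reader: ORs env->flags in directly, relying on the writer's invariant. */
static uint32_t build_tb_flags(const struct mini_env *env, uint32_t other_bits)
{
    assert((env->flags & ~TB_FLAG_ENVFLAGS_MASK) == 0);
    return env->flags | other_bits;
}
```

The invariant only holds if every assignment to the field masks the
value, which is why the patch changes the writer in cpu.c together with
the reader in cpu.h.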
* Re: [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA Richard Henderson
2017-07-07 21:42 ` Aurelien Jarno
@ 2017-07-08 16:31 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 16:31 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:20 PM, Richard Henderson wrote:
> Don't leave an unused bit after DELAY_SLOT_MASK.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/cpu.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
> index 6d179a7..da31805 100644
> --- a/target/sh4/cpu.h
> +++ b/target/sh4/cpu.h
> @@ -96,6 +96,8 @@
> #define DELAY_SLOT_CONDITIONAL (1 << 1)
> #define DELAY_SLOT_RTE (1 << 2)
>
> +#define TB_FLAG_PENDING_MOVCA (1 << 3)
> +
> #define TB_FLAG_ENVFLAGS_MASK DELAY_SLOT_MASK
>
> typedef struct tlb_t {
> @@ -369,8 +371,6 @@ static inline int cpu_ptel_pr (uint32_t ptel)
> #define PTEA_TC (1 << 3)
> #define cpu_ptea_tc(ptea) (((ptea) & PTEA_TC) >> 3)
>
> -#define TB_FLAG_PENDING_MOVCA (1 << 4)
> -
> static inline target_ulong cpu_read_sr(CPUSH4State *env)
> {
> return env->sr | (env->sr_m << SR_M) |
> @@ -395,7 +395,7 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
> | (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
> | (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
> | (env->sr & (1u << SR_FD)) /* Bit 15 */
> - | (env->movcal_backup ? TB_FLAG_PENDING_MOVCA : 0); /* Bit 4 */
> + | (env->movcal_backup ? TB_FLAG_PENDING_MOVCA : 0); /* Bit 3 */
> }
>
> #endif /* SH4_CPU_H */
>
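
The resulting layout of the TB flags word can be checked with a small
stand-alone C sketch. The bit positions follow the comments in the
quoted cpu_get_tb_cpu_state; the TB_SR_FD, TB_FPSCR_BITS and TB_SR_MD_RB
names, the two arrays and both helper functions are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DELAY_SLOT_MASK        0x7u          /* bits 0-2 (assumed value) */
#define TB_FLAG_PENDING_MOVCA  (1u << 3)     /* bit 3 after this patch */
#define TB_SR_FD               (1u << 15)    /* bit 15 */
#define TB_FPSCR_BITS          (0x7u << 19)  /* bits 19-21 */
#define TB_SR_MD_RB            (0x3u << 29)  /* bits 29-30 */

/* Layout after the patch, and the previous one with MOVCA at bit 4. */
static const uint32_t tb_fields_new[] = {
    DELAY_SLOT_MASK, TB_FLAG_PENDING_MOVCA, TB_SR_FD, TB_FPSCR_BITS, TB_SR_MD_RB
};
static const uint32_t tb_fields_old[] = {
    DELAY_SLOT_MASK, 1u << 4, TB_SR_FD, TB_FPSCR_BITS, TB_SR_MD_RB
};

/* Returns 1 when no two fields share a bit, 0 otherwise. */
static int fields_disjoint(const uint32_t *f, int n)
{
    uint32_t seen = 0;
    for (int i = 0; i < n; i++) {
        if (seen & f[i]) {
            return 0;
        }
        seen |= f[i];
    }
    return 1;
}

/* Returns the index of the lowest bit not used by any field, or -1. */
static int lowest_free_bit(const uint32_t *f, int n)
{
    uint32_t used = 0;
    for (int i = 0; i < n; i++) {
        used |= f[i];
    }
    for (int bit = 0; bit < 32; bit++) {
        if (!(used & (1u << bit))) {
            return bit;
        }
    }
    return -1;
}
```

With the old layout the lowest free bit is 3, the hole directly after
DELAY_SLOT_MASK; with the new layout the low bits are packed and the
first free bit is 4.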
* Re: [Qemu-devel] [PATCH v2 11/27] target/sh4: Unify cpu_fregs into FREG
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 11/27] target/sh4: Unify cpu_fregs into FREG Richard Henderson
2017-07-07 21:54 ` Aurelien Jarno
@ 2017-07-08 16:54 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 16:54 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:20 PM, Richard Henderson wrote:
> We were treating FREG as an index and REG as a TCGv.
> Making FREG return a TCGv is both less confusing and
> a step toward cleaner banking of cpu_fregs.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
still:
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/translate.c | 125 ++++++++++++++++++++-----------------------------
> 1 file changed, 52 insertions(+), 73 deletions(-)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index 0ac101e..b521cff 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -366,10 +366,11 @@ static inline void gen_store_fpr64 (TCGv_i64 t, int reg)
> #define REG(x) cpu_gregs[(x) ^ ctx->gbank]
> #define ALTREG(x) cpu_gregs[(x) ^ ctx->gbank ^ 0x10]
>
> -#define FREG(x) (ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x))
> +#define FREG(x) cpu_fregs[ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x)]
> #define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
> -#define XREG(x) (ctx->tbflags & FPSCR_FR ? XHACK(x) ^ 0x10 : XHACK(x))
> -#define DREG(x) FREG(x) /* Assumes lsb of (x) is always 0 */
> +#define XREG(x) FREG(XHACK(x))
> +/* Assumes lsb of (x) is always 0 */
> +#define DREG(x) (ctx->tbflags & FPSCR_FR ? (x) ^ 0x10 : (x))
>
> #define CHECK_NOT_DELAY_SLOT \
> if (ctx->envflags & DELAY_SLOT_MASK) { \
> @@ -989,56 +990,51 @@ static void _decode_opc(DisasContext * ctx)
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_SZ) {
> TCGv_i64 fp = tcg_temp_new_i64();
> - gen_load_fpr64(fp, XREG(B7_4));
> - gen_store_fpr64(fp, XREG(B11_8));
> + gen_load_fpr64(fp, XHACK(B7_4));
> + gen_store_fpr64(fp, XHACK(B11_8));
> tcg_temp_free_i64(fp);
> } else {
> - tcg_gen_mov_i32(cpu_fregs[FREG(B11_8)], cpu_fregs[FREG(B7_4)]);
> + tcg_gen_mov_i32(FREG(B11_8), FREG(B7_4));
> }
> return;
> case 0xf00a: /* fmov {F,D,X}Rm,@Rn - FPSCR: Nothing */
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_SZ) {
> TCGv addr_hi = tcg_temp_new();
> - int fr = XREG(B7_4);
> + int fr = XHACK(B7_4);
> tcg_gen_addi_i32(addr_hi, REG(B11_8), 4);
> - tcg_gen_qemu_st_i32(cpu_fregs[fr], REG(B11_8),
> - ctx->memidx, MO_TEUL);
> - tcg_gen_qemu_st_i32(cpu_fregs[fr+1], addr_hi,
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_st_i32(FREG(fr), REG(B11_8), ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_st_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
> tcg_temp_free(addr_hi);
> } else {
> - tcg_gen_qemu_st_i32(cpu_fregs[FREG(B7_4)], REG(B11_8),
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_st_i32(FREG(B7_4), REG(B11_8), ctx->memidx, MO_TEUL);
> }
> return;
> case 0xf008: /* fmov @Rm,{F,D,X}Rn - FPSCR: Nothing */
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_SZ) {
> TCGv addr_hi = tcg_temp_new();
> - int fr = XREG(B11_8);
> + int fr = XHACK(B11_8);
> tcg_gen_addi_i32(addr_hi, REG(B7_4), 4);
> - tcg_gen_qemu_ld_i32(cpu_fregs[fr], REG(B7_4), ctx->memidx, MO_TEUL);
> - tcg_gen_qemu_ld_i32(cpu_fregs[fr+1], addr_hi, ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(fr), REG(B7_4), ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
> tcg_temp_free(addr_hi);
> } else {
> - tcg_gen_qemu_ld_i32(cpu_fregs[FREG(B11_8)], REG(B7_4),
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
> }
> return;
> case 0xf009: /* fmov @Rm+,{F,D,X}Rn - FPSCR: Nothing */
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_SZ) {
> TCGv addr_hi = tcg_temp_new();
> - int fr = XREG(B11_8);
> + int fr = XHACK(B11_8);
> tcg_gen_addi_i32(addr_hi, REG(B7_4), 4);
> - tcg_gen_qemu_ld_i32(cpu_fregs[fr], REG(B7_4), ctx->memidx, MO_TEUL);
> - tcg_gen_qemu_ld_i32(cpu_fregs[fr+1], addr_hi, ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(fr), REG(B7_4), ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(fr + 1), addr_hi, ctx->memidx, MO_TEUL);
> tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 8);
> tcg_temp_free(addr_hi);
> } else {
> - tcg_gen_qemu_ld_i32(cpu_fregs[FREG(B11_8)], REG(B7_4),
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
> tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 4);
> }
> return;
> @@ -1047,13 +1043,12 @@ static void _decode_opc(DisasContext * ctx)
> TCGv addr = tcg_temp_new_i32();
> tcg_gen_subi_i32(addr, REG(B11_8), 4);
> if (ctx->tbflags & FPSCR_SZ) {
> - int fr = XREG(B7_4);
> - tcg_gen_qemu_st_i32(cpu_fregs[fr+1], addr, ctx->memidx, MO_TEUL);
> + int fr = XHACK(B7_4);
> + tcg_gen_qemu_st_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
> tcg_gen_subi_i32(addr, addr, 4);
> - tcg_gen_qemu_st_i32(cpu_fregs[fr], addr, ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_st_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
> } else {
> - tcg_gen_qemu_st_i32(cpu_fregs[FREG(B7_4)], addr,
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
> }
> tcg_gen_mov_i32(REG(B11_8), addr);
> tcg_temp_free(addr);
> @@ -1064,15 +1059,12 @@ static void _decode_opc(DisasContext * ctx)
> TCGv addr = tcg_temp_new_i32();
> tcg_gen_add_i32(addr, REG(B7_4), REG(0));
> if (ctx->tbflags & FPSCR_SZ) {
> - int fr = XREG(B11_8);
> - tcg_gen_qemu_ld_i32(cpu_fregs[fr], addr,
> - ctx->memidx, MO_TEUL);
> + int fr = XHACK(B11_8);
> + tcg_gen_qemu_ld_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
> tcg_gen_addi_i32(addr, addr, 4);
> - tcg_gen_qemu_ld_i32(cpu_fregs[fr+1], addr,
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
> } else {
> - tcg_gen_qemu_ld_i32(cpu_fregs[FREG(B11_8)], addr,
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(B11_8), addr, ctx->memidx, MO_TEUL);
> }
> tcg_temp_free(addr);
> }
> @@ -1083,15 +1075,12 @@ static void _decode_opc(DisasContext * ctx)
> TCGv addr = tcg_temp_new();
> tcg_gen_add_i32(addr, REG(B11_8), REG(0));
> if (ctx->tbflags & FPSCR_SZ) {
> - int fr = XREG(B7_4);
> - tcg_gen_qemu_ld_i32(cpu_fregs[fr], addr,
> - ctx->memidx, MO_TEUL);
> + int fr = XHACK(B7_4);
> + tcg_gen_qemu_ld_i32(FREG(fr), addr, ctx->memidx, MO_TEUL);
> tcg_gen_addi_i32(addr, addr, 4);
> - tcg_gen_qemu_ld_i32(cpu_fregs[fr+1], addr,
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_ld_i32(FREG(fr + 1), addr, ctx->memidx, MO_TEUL);
> } else {
> - tcg_gen_qemu_st_i32(cpu_fregs[FREG(B7_4)], addr,
> - ctx->memidx, MO_TEUL);
> + tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
> }
> tcg_temp_free(addr);
> }
> @@ -1139,34 +1128,28 @@ static void _decode_opc(DisasContext * ctx)
> } else {
> switch (ctx->opcode & 0xf00f) {
> case 0xf000: /* fadd Rm,Rn */
> - gen_helper_fadd_FT(cpu_fregs[FREG(B11_8)], cpu_env,
> - cpu_fregs[FREG(B11_8)],
> - cpu_fregs[FREG(B7_4)]);
> + gen_helper_fadd_FT(FREG(B11_8), cpu_env,
> + FREG(B11_8), FREG(B7_4));
> break;
> case 0xf001: /* fsub Rm,Rn */
> - gen_helper_fsub_FT(cpu_fregs[FREG(B11_8)], cpu_env,
> - cpu_fregs[FREG(B11_8)],
> - cpu_fregs[FREG(B7_4)]);
> + gen_helper_fsub_FT(FREG(B11_8), cpu_env,
> + FREG(B11_8), FREG(B7_4));
> break;
> case 0xf002: /* fmul Rm,Rn */
> - gen_helper_fmul_FT(cpu_fregs[FREG(B11_8)], cpu_env,
> - cpu_fregs[FREG(B11_8)],
> - cpu_fregs[FREG(B7_4)]);
> + gen_helper_fmul_FT(FREG(B11_8), cpu_env,
> + FREG(B11_8), FREG(B7_4));
> break;
> case 0xf003: /* fdiv Rm,Rn */
> - gen_helper_fdiv_FT(cpu_fregs[FREG(B11_8)], cpu_env,
> - cpu_fregs[FREG(B11_8)],
> - cpu_fregs[FREG(B7_4)]);
> + gen_helper_fdiv_FT(FREG(B11_8), cpu_env,
> + FREG(B11_8), FREG(B7_4));
> break;
> case 0xf004: /* fcmp/eq Rm,Rn */
> gen_helper_fcmp_eq_FT(cpu_sr_t, cpu_env,
> - cpu_fregs[FREG(B11_8)],
> - cpu_fregs[FREG(B7_4)]);
> + FREG(B11_8), FREG(B7_4));
> return;
> case 0xf005: /* fcmp/gt Rm,Rn */
> gen_helper_fcmp_gt_FT(cpu_sr_t, cpu_env,
> - cpu_fregs[FREG(B11_8)],
> - cpu_fregs[FREG(B7_4)]);
> + FREG(B11_8), FREG(B7_4));
> return;
> }
> }
> @@ -1178,9 +1161,8 @@ static void _decode_opc(DisasContext * ctx)
> if (ctx->tbflags & FPSCR_PR) {
> break; /* illegal instruction */
> } else {
> - gen_helper_fmac_FT(cpu_fregs[FREG(B11_8)], cpu_env,
> - cpu_fregs[FREG(0)], cpu_fregs[FREG(B7_4)],
> - cpu_fregs[FREG(B11_8)]);
> + gen_helper_fmac_FT(FREG(B11_8), cpu_env,
> + FREG(0), FREG(B7_4), FREG(B11_8));
> return;
> }
> }
> @@ -1718,11 +1700,11 @@ static void _decode_opc(DisasContext * ctx)
> return;
> case 0xf00d: /* fsts FPUL,FRn - FPSCR: Nothing */
> CHECK_FPU_ENABLED
> - tcg_gen_mov_i32(cpu_fregs[FREG(B11_8)], cpu_fpul);
> + tcg_gen_mov_i32(FREG(B11_8), cpu_fpul);
> return;
> case 0xf01d: /* flds FRm,FPUL - FPSCR: Nothing */
> CHECK_FPU_ENABLED
> - tcg_gen_mov_i32(cpu_fpul, cpu_fregs[FREG(B11_8)]);
> + tcg_gen_mov_i32(cpu_fpul, FREG(B11_8));
> return;
> case 0xf02d: /* float FPUL,FRn/DRn - FPSCR: R[PR,Enable.I]/W[Cause,Flag] */
> CHECK_FPU_ENABLED
> @@ -1736,7 +1718,7 @@ static void _decode_opc(DisasContext * ctx)
> tcg_temp_free_i64(fp);
> }
> else {
> - gen_helper_float_FT(cpu_fregs[FREG(B11_8)], cpu_env, cpu_fpul);
> + gen_helper_float_FT(FREG(B11_8), cpu_env, cpu_fpul);
> }
> return;
> case 0xf03d: /* ftrc FRm/DRm,FPUL - FPSCR: R[PR,Enable.V]/W[Cause,Flag] */
> @@ -1751,18 +1733,16 @@ static void _decode_opc(DisasContext * ctx)
> tcg_temp_free_i64(fp);
> }
> else {
> - gen_helper_ftrc_FT(cpu_fpul, cpu_env, cpu_fregs[FREG(B11_8)]);
> + gen_helper_ftrc_FT(cpu_fpul, cpu_env, FREG(B11_8));
> }
> return;
> case 0xf04d: /* fneg FRn/DRn - FPSCR: Nothing */
> CHECK_FPU_ENABLED
> - tcg_gen_xori_i32(cpu_fregs[FREG(B11_8)], cpu_fregs[FREG(B11_8)],
> - 0x80000000);
> + tcg_gen_xori_i32(FREG(B11_8), FREG(B11_8), 0x80000000);
> return;
> case 0xf05d: /* fabs FRn/DRn - FPCSR: Nothing */
> CHECK_FPU_ENABLED
> - tcg_gen_andi_i32(cpu_fregs[FREG(B11_8)], cpu_fregs[FREG(B11_8)],
> - 0x7fffffff);
> + tcg_gen_andi_i32(FREG(B11_8), FREG(B11_8), 0x7fffffff);
> return;
> case 0xf06d: /* fsqrt FRn */
> CHECK_FPU_ENABLED
> @@ -1775,8 +1755,7 @@ static void _decode_opc(DisasContext * ctx)
> gen_store_fpr64(fp, DREG(B11_8));
> tcg_temp_free_i64(fp);
> } else {
> - gen_helper_fsqrt_FT(cpu_fregs[FREG(B11_8)], cpu_env,
> - cpu_fregs[FREG(B11_8)]);
> + gen_helper_fsqrt_FT(FREG(B11_8), cpu_env, FREG(B11_8));
> }
> return;
> case 0xf07d: /* fsrra FRn */
> @@ -1785,13 +1764,13 @@ static void _decode_opc(DisasContext * ctx)
> case 0xf08d: /* fldi0 FRn - FPSCR: R[PR] */
> CHECK_FPU_ENABLED
> if (!(ctx->tbflags & FPSCR_PR)) {
> - tcg_gen_movi_i32(cpu_fregs[FREG(B11_8)], 0);
> + tcg_gen_movi_i32(FREG(B11_8), 0);
> }
> return;
> case 0xf09d: /* fldi1 FRn - FPSCR: R[PR] */
> CHECK_FPU_ENABLED
> if (!(ctx->tbflags & FPSCR_PR)) {
> - tcg_gen_movi_i32(cpu_fregs[FREG(B11_8)], 0x3f800000);
> + tcg_gen_movi_i32(FREG(B11_8), 0x3f800000);
> }
> return;
> case 0xf0ad: /* fcnvsd FPUL,DRn */
>
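For reference, the fneg, fabs and fldi1 cases in the hunk above operate directly on the raw single-precision bit pattern held in the 32-bit register. A minimal host-side sketch in plain C, assuming the host `float` is IEEE 754 binary32; the helper names are made up for illustration and are not QEMU functions:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: view a float as its 32-bit pattern and back. */
static uint32_t f32_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

static float f32_from_bits(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof(f));
    return f;
}

/* fneg: flip the sign bit, like the xori with 0x80000000 in the patch. */
static uint32_t model_fneg(uint32_t bits)
{
    return bits ^ 0x80000000u;
}

/* fabs: clear the sign bit, like the andi with 0x7fffffff in the patch. */
static uint32_t model_fabs(uint32_t bits)
{
    return bits & 0x7fffffffu;
}
```

The constant loaded by fldi1, 0x3f800000, is simply the bit pattern of 1.0f, which is why a plain movi is enough there.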
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 12/27] target/sh4: Pass DisasContext to fpr64 routines
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 12/27] target/sh4: Pass DisasContext to fpr64 routines Richard Henderson
2017-07-07 21:55 ` Aurelien Jarno
@ 2017-07-08 16:56 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 16:56 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:20 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/translate.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index b521cff..878c0bd 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -343,12 +343,12 @@ static void gen_delayed_conditional_jump(DisasContext * ctx)
> gen_jump(ctx);
> }
>
> -static inline void gen_load_fpr64(TCGv_i64 t, int reg)
> +static inline void gen_load_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
> {
> tcg_gen_concat_i32_i64(t, cpu_fregs[reg + 1], cpu_fregs[reg]);
> }
>
> -static inline void gen_store_fpr64 (TCGv_i64 t, int reg)
> +static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
> {
> tcg_gen_extr_i64_i32(cpu_fregs[reg + 1], cpu_fregs[reg], t);
> }
> @@ -990,8 +990,8 @@ static void _decode_opc(DisasContext * ctx)
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_SZ) {
> TCGv_i64 fp = tcg_temp_new_i64();
> - gen_load_fpr64(fp, XHACK(B7_4));
> - gen_store_fpr64(fp, XHACK(B11_8));
> + gen_load_fpr64(ctx, fp, XHACK(B7_4));
> + gen_store_fpr64(ctx, fp, XHACK(B11_8));
> tcg_temp_free_i64(fp);
> } else {
> tcg_gen_mov_i32(FREG(B11_8), FREG(B7_4));
> @@ -1100,8 +1100,8 @@ static void _decode_opc(DisasContext * ctx)
> break; /* illegal instruction */
> fp0 = tcg_temp_new_i64();
> fp1 = tcg_temp_new_i64();
> - gen_load_fpr64(fp0, DREG(B11_8));
> - gen_load_fpr64(fp1, DREG(B7_4));
> + gen_load_fpr64(ctx, fp0, DREG(B11_8));
> + gen_load_fpr64(ctx, fp1, DREG(B7_4));
> switch (ctx->opcode & 0xf00f) {
> case 0xf000: /* fadd Rm,Rn */
> gen_helper_fadd_DT(fp0, cpu_env, fp0, fp1);
> @@ -1122,7 +1122,7 @@ static void _decode_opc(DisasContext * ctx)
> gen_helper_fcmp_gt_DT(cpu_sr_t, cpu_env, fp0, fp1);
> return;
> }
> - gen_store_fpr64(fp0, DREG(B11_8));
> + gen_store_fpr64(ctx, fp0, DREG(B11_8));
> tcg_temp_free_i64(fp0);
> tcg_temp_free_i64(fp1);
> } else {
> @@ -1714,7 +1714,7 @@ static void _decode_opc(DisasContext * ctx)
> break; /* illegal instruction */
> fp = tcg_temp_new_i64();
> gen_helper_float_DT(fp, cpu_env, cpu_fpul);
> - gen_store_fpr64(fp, DREG(B11_8));
> + gen_store_fpr64(ctx, fp, DREG(B11_8));
> tcg_temp_free_i64(fp);
> }
> else {
> @@ -1728,7 +1728,7 @@ static void _decode_opc(DisasContext * ctx)
> if (ctx->opcode & 0x0100)
> break; /* illegal instruction */
> fp = tcg_temp_new_i64();
> - gen_load_fpr64(fp, DREG(B11_8));
> + gen_load_fpr64(ctx, fp, DREG(B11_8));
> gen_helper_ftrc_DT(cpu_fpul, cpu_env, fp);
> tcg_temp_free_i64(fp);
> }
> @@ -1750,9 +1750,9 @@ static void _decode_opc(DisasContext * ctx)
> if (ctx->opcode & 0x0100)
> break; /* illegal instruction */
> TCGv_i64 fp = tcg_temp_new_i64();
> - gen_load_fpr64(fp, DREG(B11_8));
> + gen_load_fpr64(ctx, fp, DREG(B11_8));
> gen_helper_fsqrt_DT(fp, cpu_env, fp);
> - gen_store_fpr64(fp, DREG(B11_8));
> + gen_store_fpr64(ctx, fp, DREG(B11_8));
> tcg_temp_free_i64(fp);
> } else {
> gen_helper_fsqrt_FT(FREG(B11_8), cpu_env, FREG(B11_8));
> @@ -1778,7 +1778,7 @@ static void _decode_opc(DisasContext * ctx)
> {
> TCGv_i64 fp = tcg_temp_new_i64();
> gen_helper_fcnvsd_FT_DT(fp, cpu_env, cpu_fpul);
> - gen_store_fpr64(fp, DREG(B11_8));
> + gen_store_fpr64(ctx, fp, DREG(B11_8));
> tcg_temp_free_i64(fp);
> }
> return;
> @@ -1786,7 +1786,7 @@ static void _decode_opc(DisasContext * ctx)
> CHECK_FPU_ENABLED
> {
> TCGv_i64 fp = tcg_temp_new_i64();
> - gen_load_fpr64(fp, DREG(B11_8));
> + gen_load_fpr64(ctx, fp, DREG(B11_8));
> gen_helper_fcnvds_DT_FT(cpu_fpul, cpu_env, fp);
> tcg_temp_free_i64(fp);
> }
>
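As I read the two helpers quoted above, the even-numbered register supplies the high 32 bits of the double and the odd-numbered one the low 32 bits. A small plain-C model of that layout, as a sketch only (the `model_` names and the bare `fregs` array are assumptions for illustration, not QEMU code):

```c
#include <assert.h>
#include <stdint.h>

/* Models the concat in gen_load_fpr64: fregs[reg + 1] is the low half,
   fregs[reg] is the high half. */
static uint64_t model_load_fpr64(const uint32_t fregs[], int reg)
{
    return ((uint64_t)fregs[reg] << 32) | fregs[reg + 1];
}

/* Models the split in gen_store_fpr64: the inverse of the above. */
static void model_store_fpr64(uint32_t fregs[], int reg, uint64_t t)
{
    fregs[reg + 1] = (uint32_t)t;
    fregs[reg] = (uint32_t)(t >> 32);
}
```

With this layout, a register pair holding 0x3ff00000 and 0 reads back as 0x3ff0000000000000, the bit pattern of the double 1.0.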
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 18/27] target/sh4: Unify code for CHECK_NOT_DELAY_SLOT
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 18/27] target/sh4: Unify code for CHECK_NOT_DELAY_SLOT Richard Henderson
2017-07-07 22:17 ` Aurelien Jarno
@ 2017-07-08 16:59 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 16:59 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:21 PM, Richard Henderson wrote:
> We do not need to emit N copies of raising an exception.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/translate.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index 3453f19..41157a0 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -377,11 +377,8 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
> #define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
>
> #define CHECK_NOT_DELAY_SLOT \
> - if (ctx->envflags & DELAY_SLOT_MASK) { \
> - gen_save_cpu_state(ctx, true); \
> - gen_helper_raise_slot_illegal_instruction(cpu_env); \
> - ctx->bstate = BS_EXCP; \
> - return; \
> + if (ctx->envflags & DELAY_SLOT_MASK) { \
> + goto do_illegal_slot; \
> }
>
> #define CHECK_PRIVILEGED \
> @@ -1820,10 +1817,12 @@ static void _decode_opc(DisasContext * ctx)
> ctx->opcode, ctx->pc);
> fflush(stderr);
> #endif
> - gen_save_cpu_state(ctx, true);
> if (ctx->envflags & DELAY_SLOT_MASK) {
> + do_illegal_slot:
> + gen_save_cpu_state(ctx, true);
> gen_helper_raise_slot_illegal_instruction(cpu_env);
> } else {
> + gen_save_cpu_state(ctx, true);
> gen_helper_raise_illegal_instruction(cpu_env);
> }
> ctx->bstate = BS_EXCP;
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 19/27] target/sh4: Unify code for CHECK_PRIVILEGED
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 19/27] target/sh4: Unify code for CHECK_PRIVILEGED Richard Henderson
2017-07-07 22:17 ` Aurelien Jarno
@ 2017-07-08 17:00 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 17:00 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:21 PM, Richard Henderson wrote:
> We do not need to emit N copies of raising an exception.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/translate.c | 14 ++++----------
> 1 file changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index 41157a0..dd14b43 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -381,16 +381,9 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
> goto do_illegal_slot; \
> }
>
> -#define CHECK_PRIVILEGED \
> - if (IS_USER(ctx)) { \
> - gen_save_cpu_state(ctx, true); \
> - if (ctx->envflags & DELAY_SLOT_MASK) { \
> - gen_helper_raise_slot_illegal_instruction(cpu_env); \
> - } else { \
> - gen_helper_raise_illegal_instruction(cpu_env); \
> - } \
> - ctx->bstate = BS_EXCP; \
> - return; \
> +#define CHECK_PRIVILEGED \
> + if (IS_USER(ctx)) { \
> + goto do_illegal; \
> }
>
> #define CHECK_FPU_ENABLED \
> @@ -1817,6 +1810,7 @@ static void _decode_opc(DisasContext * ctx)
> ctx->opcode, ctx->pc);
> fflush(stderr);
> #endif
> + do_illegal:
> if (ctx->envflags & DELAY_SLOT_MASK) {
> do_illegal_slot:
> gen_save_cpu_state(ctx, true);
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 20/27] target/sh4: Unify code for CHECK_FPU_ENABLED
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 20/27] target/sh4: Unify code for CHECK_FPU_ENABLED Richard Henderson
2017-07-07 22:18 ` Aurelien Jarno
@ 2017-07-08 17:01 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 17:01 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:21 PM, Richard Henderson wrote:
> We do not need to emit N copies of raising an exception.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/translate.c | 24 ++++++++++++++----------
> 1 file changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index dd14b43..a4370c6 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -386,16 +386,9 @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
> goto do_illegal; \
> }
>
> -#define CHECK_FPU_ENABLED \
> - if (ctx->tbflags & (1u << SR_FD)) { \
> - gen_save_cpu_state(ctx, true); \
> - if (ctx->envflags & DELAY_SLOT_MASK) { \
> - gen_helper_raise_slot_fpu_disable(cpu_env); \
> - } else { \
> - gen_helper_raise_fpu_disable(cpu_env); \
> - } \
> - ctx->bstate = BS_EXCP; \
> - return; \
> +#define CHECK_FPU_ENABLED \
> + if (ctx->tbflags & (1u << SR_FD)) { \
> + goto do_fpu_disabled; \
> }
>
> static void _decode_opc(DisasContext * ctx)
> @@ -1820,6 +1813,17 @@ static void _decode_opc(DisasContext * ctx)
> gen_helper_raise_illegal_instruction(cpu_env);
> }
> ctx->bstate = BS_EXCP;
> + return;
> +
> + do_fpu_disabled:
> + gen_save_cpu_state(ctx, true);
> + if (ctx->envflags & DELAY_SLOT_MASK) {
> + gen_helper_raise_slot_fpu_disable(cpu_env);
> + } else {
> + gen_helper_raise_fpu_disable(cpu_env);
> + }
> + ctx->bstate = BS_EXCP;
> + return;
> }
>
> static void decode_opc(DisasContext * ctx)
>
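Taken together, patches 18 to 20 turn each CHECK_* macro into a single goto to an exception tail shared by the whole decoder. The shape can be sketched in standalone C as follows; the `struct ctx` fields, the `outcome` enum and the toy opcodes are hypothetical stand-ins for DisasContext and the raise helpers, so this is an illustration of the control flow rather than the real translator:

```c
#include <assert.h>
#include <stdbool.h>

enum outcome { OK, ILLEGAL, SLOT_ILLEGAL, FPU_DISABLED, SLOT_FPU_DISABLED };

/* Hypothetical stand-in for the relevant DisasContext state. */
struct ctx {
    bool user;
    bool fpu_off;
    bool in_delay_slot;
};

#define CHECK_NOT_DELAY_SLOT  if (c->in_delay_slot) { goto do_illegal_slot; }
#define CHECK_PRIVILEGED      if (c->user) { goto do_illegal; }
#define CHECK_FPU_ENABLED     if (c->fpu_off) { goto do_fpu_disabled; }

static enum outcome decode(const struct ctx *c, int opcode)
{
    switch (opcode) {
    case 0: /* toy privileged instruction */
        CHECK_PRIVILEGED
        return OK;
    case 1: /* toy FPU instruction */
        CHECK_FPU_ENABLED
        return OK;
    case 2: /* toy instruction that is illegal in a delay slot */
        CHECK_NOT_DELAY_SLOT
        return OK;
    }

    /* Unknown opcodes fall out of the switch into the shared tail. */
 do_illegal:
    if (c->in_delay_slot) {
 do_illegal_slot:
        return SLOT_ILLEGAL;
    }
    return ILLEGAL;

 do_fpu_disabled:
    return c->in_delay_slot ? SLOT_FPU_DISABLED : FPU_DISABLED;
}
```

Each exception path is written once, at the bottom of the function, however many instructions use the corresponding check.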
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 21/27] target/sh4: Tidy misc illegal insn checks
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 21/27] target/sh4: Tidy misc illegal insn checks Richard Henderson
2017-07-07 22:18 ` Aurelien Jarno
@ 2017-07-08 17:02 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 89+ messages in thread
From: Philippe Mathieu-Daudé @ 2017-07-08 17:02 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: bruno, laurent, aurelien, glaubitz
On 07/06/2017 11:21 PM, Richard Henderson wrote:
> Now that we have a do_illegal label, use goto in order
> to self-document the forcing of the exception.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> target/sh4/translate.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index a4370c6..06cf649 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -1079,8 +1079,9 @@ static void _decode_opc(DisasContext * ctx)
> if (ctx->tbflags & FPSCR_PR) {
> TCGv_i64 fp0, fp1;
>
> - if (ctx->opcode & 0x0110)
> - break; /* illegal instruction */
> + if (ctx->opcode & 0x0110) {
> + goto do_illegal;
> + }
> fp0 = tcg_temp_new_i64();
> fp1 = tcg_temp_new_i64();
> gen_load_fpr64(ctx, fp0, B11_8);
> @@ -1142,7 +1143,7 @@ static void _decode_opc(DisasContext * ctx)
> {
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_PR) {
> - break; /* illegal instruction */
> + goto do_illegal;
> } else {
> gen_helper_fmac_FT(FREG(B11_8), cpu_env,
> FREG(0), FREG(B7_4), FREG(B11_8));
> @@ -1693,8 +1694,9 @@ static void _decode_opc(DisasContext * ctx)
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_PR) {
> TCGv_i64 fp;
> - if (ctx->opcode & 0x0100)
> - break; /* illegal instruction */
> + if (ctx->opcode & 0x0100) {
> + goto do_illegal;
> + }
> fp = tcg_temp_new_i64();
> gen_helper_float_DT(fp, cpu_env, cpu_fpul);
> gen_store_fpr64(ctx, fp, B11_8);
> @@ -1708,8 +1710,9 @@ static void _decode_opc(DisasContext * ctx)
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_PR) {
> TCGv_i64 fp;
> - if (ctx->opcode & 0x0100)
> - break; /* illegal instruction */
> + if (ctx->opcode & 0x0100) {
> + goto do_illegal;
> + }
> fp = tcg_temp_new_i64();
> gen_load_fpr64(ctx, fp, B11_8);
> gen_helper_ftrc_DT(cpu_fpul, cpu_env, fp);
> @@ -1730,8 +1733,9 @@ static void _decode_opc(DisasContext * ctx)
> case 0xf06d: /* fsqrt FRn */
> CHECK_FPU_ENABLED
> if (ctx->tbflags & FPSCR_PR) {
> - if (ctx->opcode & 0x0100)
> - break; /* illegal instruction */
> + if (ctx->opcode & 0x0100) {
> + goto do_illegal;
> + }
> TCGv_i64 fp = tcg_temp_new_i64();
> gen_load_fpr64(ctx, fp, B11_8);
> gen_helper_fsqrt_DT(fp, cpu_env, fp);
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 16:22 ` Richard Henderson
@ 2017-07-13 9:09 ` John Paul Adrian Glaubitz
2017-07-13 10:56 ` John Paul Adrian Glaubitz
0 siblings, 1 reply; 89+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-07-13 9:09 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, aurelien, laurent, bruno, glaubitz
Just as a heads-up: The current set of patches as they are present in
the tgt-sh4 branch on git://github.com/rth7680/qemu.git are a HUGE
improvement for qemu-sh4 user mode. Several packages which previously
crashed during build or use can now be used properly!
So, I hope to see this series merged upstream soon! Thanks a lot!
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-13 9:09 ` John Paul Adrian Glaubitz
@ 2017-07-13 10:56 ` John Paul Adrian Glaubitz
2017-07-13 21:37 ` Richard Henderson
0 siblings, 1 reply; 89+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-07-13 10:56 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, aurelien, laurent, bruno, glaubitz
On Thu, Jul 13, 2017 at 11:09:08AM +0200, John Paul Adrian Glaubitz wrote:
> Just as a heads-up: The current set of patches as they are present in
> the tgt-sh4 branch on git://github.com/rth7680/qemu.git are a HUGE
> improvement for qemu-sh4 user mode. Several packages which previously
> crashed during build or use can now be used properly!
The only thing that still doesn't work is GHC, unfortunately:
(sid-sh4-sbuild)root@nofan:/# ghc Main.hs
/bin/bash: warning: setlocale: LC_ALL: cannot change locale
(en_US.UTF-8)
[1 of 1] Compiling Main ( Main.hs, Main.o )
^C
(sid-sh4-sbuild)root@nofan:/# ghc Main.hs
/bin/bash: warning: setlocale: LC_ALL: cannot change locale
(en_US.UTF-8)
[1 of 1] Compiling Main ( Main.hs, Main.o )
^C
(sid-sh4-sbuild)root@nofan:/# ghc Main.hs
/bin/bash: warning: setlocale: LC_ALL: cannot change locale
(en_US.UTF-8)
[1 of 1] Compiling Main ( Main.hs, Main.o )
^C
(sid-sh4-sbuild)root@nofan:/# ghc Main.hs
/bin/bash: warning: setlocale: LC_ALL: cannot change locale
(en_US.UTF-8)
ghc: internal error: scavenge_static: strange closure 8
(GHC version 8.0.1 for sh4_unknown_linux)
Please report this as a GHC bug:
http://www.haskell.org/ghc/reportabug
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted
(sid-sh4-sbuild)root@nofan:/#
It either hangs, so I had to use "Ctrl+C" or it segfaulted.
Sometimes it would crash with:
(sid-sh4-sbuild)root@nofan:/# ghc Main.hs
/bin/bash: warning: setlocale: LC_ALL: cannot change locale
(en_US.UTF-8)
Unhandled trap: 0x180
pc=0x07019782 sr=0x00000000 pr=0x0668c52c fpscr=0x00080000
spc=0x00000000 ssr=0x00000000 gbr=0x7f5ae680 vbr=0x00000000
sgr=0x00000000 dbr=0x00000000 delayed_pc=0x0668c52c fpul=0x05f7595c
r0=0x07019780 r1=0x7d9f90d8 r2=0x00019780 r3=0x00009780
r4=0x00000001 r5=0x0000001e r6=0x70000000 r7=0x00000080
r8=0x00000001 r9=0x00000000 r10=0x7f5ae694 r11=0x00000000
r12=0x07099840 r13=0x7f4fe000 r14=0x7ffff4cc r15=0x7ffff4c8
r16=0x00000000 r17=0x00000000 r18=0x00000000 r19=0x00000000
r20=0x00000000 r21=0x00000000 r22=0x00000000 r23=0x00000000
(sid-sh4-sbuild)root@nofan:/#
Source file is available here [1]. In order to be able to install,
you need to bind-mount /proc into the chroot first:
mount -o bind /proc /path/to/sid-sh4-sbuild/proc/
Adrian
> [1] https://people.debian.org/~glaubitz/Main.hs
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-13 10:56 ` John Paul Adrian Glaubitz
@ 2017-07-13 21:37 ` Richard Henderson
2017-07-13 21:42 ` John Paul Adrian Glaubitz
0 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-13 21:37 UTC (permalink / raw)
To: John Paul Adrian Glaubitz; +Cc: qemu-devel, aurelien, laurent, bruno, glaubitz
On 07/13/2017 12:56 AM, John Paul Adrian Glaubitz wrote:
> It either hangs, so I had to use "Ctrl+C" or it segfaulted.
>
> Sometimes it would crash with:
>
> (sid-sh4-sbuild)root@nofan:/# ghc Main.hs
> /bin/bash: warning: setlocale: LC_ALL: cannot change locale
> (en_US.UTF-8)
> Unhandled trap: 0x180
> pc=0x07019782 sr=0x00000000 pr=0x0668c52c fpscr=0x00080000
> spc=0x00000000 ssr=0x00000000 gbr=0x7f5ae680 vbr=0x00000000
> sgr=0x00000000 dbr=0x00000000 delayed_pc=0x0668c52c fpul=0x05f7595c
> r0=0x07019780 r1=0x7d9f90d8 r2=0x00019780 r3=0x00009780
> r4=0x00000001 r5=0x0000001e r6=0x70000000 r7=0x00000080
> r8=0x00000001 r9=0x00000000 r10=0x7f5ae694 r11=0x00000000
> r12=0x07099840 r13=0x7f4fe000 r14=0x7ffff4cc r15=0x7ffff4c8
> r16=0x00000000 r17=0x00000000 r18=0x00000000 r19=0x00000000
> r20=0x00000000 r21=0x00000000 r22=0x00000000 r23=0x00000000
> (sid-sh4-sbuild)root@nofan:/#
>
> Source file is available here [1]. In order to be able to install,
> you need to bind-mount /proc into the chroot first:
>
> mount -o bind /proc /path/to/sid-sh4-sbuild/proc/
>
> Adrian
>
>> [1] https://people.debian.org/~glaubitz/Main.hs
I can reproduce this non-reproducible behaviour.
Unfortunately with something as complex as ghc this doesn't help much. I hope
you can find something smaller that produces a similar problem...
r~
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-13 21:37 ` Richard Henderson
@ 2017-07-13 21:42 ` John Paul Adrian Glaubitz
0 siblings, 0 replies; 89+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-07-13 21:42 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, aurelien, laurent, bruno, glaubitz
On 07/13/2017 11:37 PM, Richard Henderson wrote:
> Unfortunately with something as complex as ghc this doesn't help much.
> I hope you can find something smaller that produces a similar problem...
GHC is truly a complex matter. It is in fact the package that causes the
most trouble on qemu-user, on both m68k and sh4, although it works fine
on qemu-user for ARM.
I will try to reduce the test case.
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics Richard Henderson
@ 2017-07-15 22:14 ` Aurelien Jarno
2017-07-15 22:16 ` John Paul Adrian Glaubitz
2017-07-16 2:30 ` Richard Henderson
0 siblings, 2 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-15 22:14 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> For uniprocessors, SH4 uses optimistic restartable atomic sequences.
> Upon an interrupt, a real kernel would simply notice magic values in
> the registers and reset the PC to the start of the sequence.
>
> For QEMU, we cannot do this in quite the same way. Instead, we notice
> the normal start of such a sequence (mov #-x,r15), and start a new TB
> that can be executed under cpu_exec_step_atomic.
Do you actually have good documentation about gUSA? I have found
a few documents (some of them in Japanese), the most complete one being
the LinuxTag paper. The ABI is also described in the kernel and in
glibc. That said, I am missing the following information:
- What kind of instructions are allowed in the atomic sequence? Your
patch takes branches into account, but are they allowed? Are they used
in practice? What about FP instructions?
- Is the atomic sequence actually allowed to cross pages?
- Is any alignment required? The paper mentions adding a nop to
gUSA_exchange_and_add to align the end point to 4 bytes.
Depending on the answers, your patch can probably be simplified.
Overall it looks good; please find some comments below.
> Reported-by: Bruno Haible <bruno@clisp.org>
> LP: https://bugs.launchpad.net/bugs/1701971
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/cpu.h | 18 +++++--
> target/sh4/helper.h | 1 +
> target/sh4/op_helper.c | 6 +++
> target/sh4/translate.c | 131 +++++++++++++++++++++++++++++++++++++++++++++++--
> 4 files changed, 148 insertions(+), 8 deletions(-)
>
> diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
> index da31805..e3abb6a 100644
> --- a/target/sh4/cpu.h
> +++ b/target/sh4/cpu.h
> @@ -98,7 +98,18 @@
>
> #define TB_FLAG_PENDING_MOVCA (1 << 3)
>
> -#define TB_FLAG_ENVFLAGS_MASK DELAY_SLOT_MASK
> +#define GUSA_SHIFT 4
> +#ifdef CONFIG_USER_ONLY
> +#define GUSA_EXCLUSIVE (1 << 12)
> +#define GUSA_MASK ((0xff << GUSA_SHIFT) | GUSA_EXCLUSIVE)
> +#else
> +/* Provide dummy versions of the above to allow tests against tbflags
> + to be elided while avoiding ifdefs. */
> +#define GUSA_EXCLUSIVE 0
> +#define GUSA_MASK 0
> +#endif
> +
> +#define TB_FLAG_ENVFLAGS_MASK (DELAY_SLOT_MASK | GUSA_MASK)
>
> typedef struct tlb_t {
> uint32_t vpn; /* virtual page number */
> @@ -390,8 +401,9 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
> target_ulong *cs_base, uint32_t *flags)
> {
> *pc = env->pc;
> - *cs_base = 0;
> - *flags = env->flags /* Bits 0-2 */
> + /* For a gUSA region, notice the end of the region. */
> + *cs_base = env->flags & GUSA_MASK ? env->gregs[0] : 0;
> + *flags = env->flags /* TB_FLAG_ENVFLAGS_MASK: bits 0-2, 4-12 */
> | (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
> | (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
> | (env->sr & (1u << SR_FD)) /* Bit 15 */
> diff --git a/target/sh4/helper.h b/target/sh4/helper.h
> index 767a6d5..6c6fa04 100644
> --- a/target/sh4/helper.h
> +++ b/target/sh4/helper.h
> @@ -6,6 +6,7 @@ DEF_HELPER_1(raise_slot_fpu_disable, noreturn, env)
> DEF_HELPER_1(debug, noreturn, env)
> DEF_HELPER_1(sleep, noreturn, env)
> DEF_HELPER_2(trapa, noreturn, env, i32)
> +DEF_HELPER_1(exclusive, noreturn, env)
>
> DEF_HELPER_3(movcal, void, env, i32, i32)
> DEF_HELPER_1(discard_movcal_backup, void, env)
> diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
> index c3d19b1..8513f38 100644
> --- a/target/sh4/op_helper.c
> +++ b/target/sh4/op_helper.c
> @@ -115,6 +115,12 @@ void helper_trapa(CPUSH4State *env, uint32_t tra)
> raise_exception(env, 0x160, 0);
> }
>
> +void helper_exclusive(CPUSH4State *env)
> +{
> + /* We do not want cpu_restore_state to run. */
> + cpu_loop_exit_atomic(ENV_GET_CPU(env), 0);
> +}
> +
> void helper_movcal(CPUSH4State *env, uint32_t address, uint32_t value)
> {
> if (cpu_sh4_is_cached (env, address))
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index cf53cd6..653c06c 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -235,7 +235,9 @@ static inline bool use_goto_tb(DisasContext *ctx, target_ulong dest)
> if (unlikely(ctx->singlestep_enabled)) {
> return false;
> }
> -
> + if (ctx->tbflags & GUSA_EXCLUSIVE) {
> + return false;
> + }
> #ifndef CONFIG_USER_ONLY
> return (ctx->tb->pc & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK);
> #else
> @@ -278,6 +280,28 @@ static void gen_conditional_jump(DisasContext * ctx,
> target_ulong ift, target_ulong ifnott)
> {
> TCGLabel *l1 = gen_new_label();
> +
> + if (ctx->tbflags & GUSA_EXCLUSIVE) {
> + /* When in an exclusive region, we must continue to the end.
> + Therefore, exit the region on a taken branch, but otherwise
> + fall through to the next instruction. */
> + uint32_t taken;
> + TCGCond cond;
> +
> + if (ift == ctx->pc + 2) {
> + taken = ifnott;
> + cond = TCG_COND_NE;
> + } else {
> + taken = ift;
> + cond = TCG_COND_EQ;
> + }
> + tcg_gen_brcondi_i32(cond, cpu_sr_t, 0, l1);
> + tcg_gen_movi_i32(cpu_flags, ctx->envflags & ~GUSA_MASK);
> + gen_goto_tb(ctx, 0, taken);
> + gen_set_label(l1);
> + return;
> + }
> +
This looks fine. I guess this can be improved in another patch by
changing the caller to pass a single target address and whether the
branch is taken when the condition is true or false. This would avoid
having to detect that here, and would not make the code below more
complicated.
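A rough sketch of what such an interface could look like, in plain C outside of TCG. The `cond_branch` struct and both function names are hypothetical; `from_pair` only reproduces the detection that the quoted hunk does on the (ift, ifnott) pair:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interface: one branch target plus the polarity of the test. */
struct cond_branch {
    uint32_t target;     /* where a taken branch goes */
    bool jump_if_true;   /* true for bt, false for bf */
};

/* Recover that form from the (ift, ifnott) pair the caller passes today. */
static struct cond_branch from_pair(uint32_t pc, uint32_t ift, uint32_t ifnott)
{
    struct cond_branch b;

    if (ift == pc + 2) {
        /* The "true" side falls through, so this is a bf. */
        b.target = ifnott;
        b.jump_if_true = false;
    } else {
        b.target = ift;
        b.jump_if_true = true;
    }
    return b;
}

/* Next PC for a given value of the T bit. */
static uint32_t next_pc(uint32_t pc, struct cond_branch b, bool t)
{
    return (t == b.jump_if_true) ? b.target : pc + 2;
}
```

With the target and polarity passed in directly, the gUSA path would no longer need to compare against ctx->pc + 2 to work out which side is the taken one.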
> gen_save_cpu_state(ctx, false);
> tcg_gen_brcondi_i32(TCG_COND_NE, cpu_sr_t, 0, l1);
> gen_goto_tb(ctx, 0, ifnott);
> @@ -289,13 +313,26 @@ static void gen_conditional_jump(DisasContext * ctx,
> /* Delayed conditional jump (bt or bf) */
> static void gen_delayed_conditional_jump(DisasContext * ctx)
> {
> - TCGLabel *l1;
> - TCGv ds;
> + TCGLabel *l1 = gen_new_label();
> + TCGv ds = tcg_temp_new();
>
> - l1 = gen_new_label();
> - ds = tcg_temp_new();
> tcg_gen_mov_i32(ds, cpu_delayed_cond);
> tcg_gen_discard_i32(cpu_delayed_cond);
> +
> + if (ctx->tbflags & GUSA_EXCLUSIVE) {
> + /* When in an exclusive region, we must continue to the end.
> + Therefore, exit the region on a taken branch, but otherwise
> + fall through to the next instruction. */
> + tcg_gen_brcondi_i32(TCG_COND_EQ, ds, 0, l1);
> +
> + /* Leave the gUSA region. */
> + tcg_gen_movi_i32(cpu_flags, ctx->envflags & ~GUSA_MASK);
> + gen_jump(ctx);
> +
> + gen_set_label(l1);
> + return;
> + }
> +
> tcg_gen_brcondi_i32(TCG_COND_NE, ds, 0, l1);
> gen_goto_tb(ctx, 1, ctx->pc + 2);
> gen_set_label(l1);
> @@ -480,6 +517,15 @@ static void _decode_opc(DisasContext * ctx)
> }
> return;
> case 0xe000: /* mov #imm,Rn */
> +#ifdef CONFIG_USER_ONLY
> + /* Detect the start of a gUSA region. If so, update envflags
> + and end the TB. This will allow us to see the end of the
> + region (stored in R0) in the next TB. */
> + if (B11_8 == 15 && B7_0s < 0) {
> + ctx->envflags = deposit32(ctx->envflags, GUSA_SHIFT, 8, B7_0s);
> + ctx->bstate = BS_STOP;
> + }
> +#endif
>
> tcg_gen_movi_i32(REG(B11_8), B7_0s);
> return;
> case 0x9000: /* mov.w @(disp,PC),Rn */
> @@ -1814,6 +1860,18 @@ static void decode_opc(DisasContext * ctx)
> if (old_flags & DELAY_SLOT_MASK) {
> /* go out of the delay slot */
> ctx->envflags &= ~DELAY_SLOT_MASK;
> +
> + /* When in an exclusive region, we must continue to the end
> + for conditional branches. */
> + if (ctx->tbflags & GUSA_EXCLUSIVE
> + && old_flags & DELAY_SLOT_CONDITIONAL) {
> + gen_delayed_conditional_jump(ctx);
> + return;
> + }
> + /* Otherwise this is probably an invalid gUSA region.
> + Drop the GUSA bits so the next TB doesn't see them. */
> + ctx->envflags &= ~GUSA_MASK;
> +
> tcg_gen_movi_i32(cpu_flags, ctx->envflags);
> ctx->bstate = BS_BRANCH;
> if (old_flags & DELAY_SLOT_CONDITIONAL) {
> @@ -1821,9 +1879,60 @@ static void decode_opc(DisasContext * ctx)
> } else {
> gen_jump(ctx);
> }
> + }
> +}
>
> +#ifdef CONFIG_USER_ONLY
> +/* For uniprocessors, SH4 uses optimistic restartable atomic sequences.
> + Upon an interrupt, a real kernel would simply notice magic values in
> + the registers and reset the PC to the start of the sequence.
> +
> + For QEMU, we cannot do this in quite the same way. Instead, we notice
> + the normal start of such a sequence (mov #-x,r15). While we can handle
> + any sequence via cpu_exec_step_atomic, we can recognize the "normal"
> + sequences and transform them into atomic operations as seen by the host.
> +*/
> +static int decode_gusa(DisasContext *ctx, CPUSH4State *env, int *pmax_insns)
> +{
> + uint32_t pc = ctx->pc;
> + uint32_t pc_end = ctx->tb->cs_base;
> + int backup = sextract32(ctx->tbflags, GUSA_SHIFT, 8);
> + int max_insns = (pc_end - pc) / 2;
> +
> + if (pc != pc_end + backup || max_insns < 2) {
> + /* This is a malformed gUSA region. Don't do anything special,
> + since the interpreter is likely to get confused. */
> + ctx->envflags &= ~GUSA_MASK;
> + return 0;
> + }
>
> + if (ctx->tbflags & GUSA_EXCLUSIVE) {
> + /* Regardless of single-stepping or the end of the page,
> + we must complete execution of the gUSA region while
> + holding the exclusive lock. */
> + *pmax_insns = max_insns;
> + return 0;
> }
What are the consequences of not stopping the translation when crossing
a page in user mode? If there are none, we should probably change
the code to never stop when crossing pages.
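To make the region check and the page-crossing question concrete, here is a standalone sketch in plain C. It mirrors the quoted `pc != pc_end + backup || max_insns < 2` test and the 8-bit field at GUSA_SHIFT; the function names are hypothetical, and the 4 KiB page size is an assumption of this sketch rather than something taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GUSA_SHIFT    4           /* as in the quoted cpu.h hunk */
#define PAGE_MASK_4K  (~0xfffu)   /* assumption: 4 KiB target pages */

/* Store the low 8 bits of the negative displacement in bits 4..11. */
static uint32_t gusa_encode(uint32_t flags, int disp)
{
    flags &= ~(0xffu << GUSA_SHIFT);
    return flags | (((uint32_t)disp & 0xffu) << GUSA_SHIFT);
}

/* Sign-extend the 8-bit field back to an int. */
static int gusa_decode(uint32_t flags)
{
    int v = (int)((flags >> GUSA_SHIFT) & 0xffu);
    return v >= 0x80 ? v - 0x100 : v;
}

/* pc must be the start of the region (pc_end + backup), and the region
   must hold at least two 2-byte instructions. */
static bool gusa_region_ok(uint32_t pc, uint32_t pc_end, int backup)
{
    uint32_t max_insns = (pc_end - pc) / 2;
    return pc == pc_end + (uint32_t)backup && max_insns >= 2;
}

/* True if the first and last byte of the region lie on different pages. */
static bool gusa_crosses_page(uint32_t pc, uint32_t pc_end)
{
    return (pc & PAGE_MASK_4K) != ((pc_end - 1) & PAGE_MASK_4K);
}
```

A predicate like `gusa_crosses_page` is only meant to show what "crossing a page" would mean for such a region; whether real gUSA sequences ever do so is exactly the open question above.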
> + qemu_log_mask(LOG_UNIMP, "Unrecognized gUSA sequence %08x-%08x\n",
> + pc, pc_end);
> +
> + /* Restart with the EXCLUSIVE bit set, within a TB run via
> + cpu_exec_step_atomic holding the exclusive lock. */
> + tcg_gen_insn_start(pc, ctx->envflags);
> + ctx->envflags |= GUSA_EXCLUSIVE;
> + gen_save_cpu_state(ctx, false);
> + gen_helper_exclusive(cpu_env);
> + ctx->bstate = BS_EXCP;
> +
> + /* We're not executing an instruction, but we must report one for the
> + purposes of accounting within the TB. We might as well report the
> + entire region consumed via ctx->pc so that it's immediately available
> + in the disassembly dump. */
> + ctx->pc = pc_end;
> + return 1;
> }
> +#endif
>
> void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
> {
> @@ -1869,6 +1978,12 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
> gen_tb_start(tb);
> num_insns = 0;
>
> +#ifdef CONFIG_USER_ONLY
> + if (ctx.tbflags & GUSA_MASK) {
> + num_insns = decode_gusa(&ctx, env, &max_insns);
> + }
> +#endif
> +
> while (ctx.bstate == BS_NONE
> && num_insns < max_insns
> && !tcg_op_buf_full()) {
> @@ -1899,6 +2014,12 @@ void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
> if (tb->cflags & CF_LAST_IO) {
> gen_io_end();
> }
> +
> + if (ctx.tbflags & GUSA_EXCLUSIVE) {
> + /* Ending the region of exclusivity. Clear the bits. */
> + ctx.envflags &= ~GUSA_MASK;
> + }
> +
IIUC this assumes that every instruction in the sequence is always
executed. I guess this is not correct if the TCG op buffer is full. Some
non-privileged instructions might also stop the translation, but they
are all FPU instructions, so I guess the section is simply not valid in
that case.
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
2017-07-15 22:14 ` Aurelien Jarno
@ 2017-07-15 22:16 ` John Paul Adrian Glaubitz
2017-07-16 2:30 ` Richard Henderson
1 sibling, 0 replies; 89+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-07-15 22:16 UTC (permalink / raw)
To: Aurelien Jarno, Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 07/16/2017 12:14 AM, Aurelien Jarno wrote:
> Do you actually have good documentation about gUSA? I have found
> a few documents (some of them in Japanese), the most complete one being
> the LinuxTag paper. The ABI is also described in the kernel and the
> glibc. That said, I am missing the following information:
> - What kind of instructions are allowed in the atomic sequence? Your
> patch takes branches into account, but are they allowed? Used in
> practice? What about FP instructions?
> - Is the atomic sequence actually allowed to cross pages?
> - Is there any alignment required? The paper mentions adding a nop to
> gUSA_exchange_and_add to align the end point to 4 bytes.
The best person to answer this is Yutaka Niibe as he is actually the
person who came up with the design. I'll drop him a message and see
if he can join the discussion.
Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - glaubitz@debian.org
`. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery Richard Henderson
2017-07-07 7:25 ` John Paul Adrian Glaubitz
2017-07-07 9:05 ` [Qemu-devel] [PATCH v2 08/27] " Laurent Vivier
@ 2017-07-15 22:52 ` Aurelien Jarno
2 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-15 22:52 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> We translate gUSA regions atomically in a parallel context.
> But in a serial context a gUSA region may be interrupted.
> In that case, restart the region as the kernel would.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> linux-user/signal.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
With my limited knowledge of linux-user:
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries Richard Henderson
@ 2017-07-15 22:59 ` Aurelien Jarno
2017-07-16 2:33 ` Richard Henderson
0 siblings, 1 reply; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-15 22:59 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> If a signal is delivered during the execution of a delay slot,
> or a gUSA region, clear those bits from the environment so that
> the signal handler does not start in that same state.
How are signals delivered in linux-user? At least in system mode we
forbid interrupts in the delay slot (see commit 5c6f3eb7db), as the
manual clearly declares them indivisible. Maybe the same should be
done for linux-user?
>
> Cleaning the bits on signal return is paranoid good sense.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> linux-user/signal.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index a537778..8c0b851 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -3544,6 +3544,7 @@ static void restore_sigcontext(CPUSH4State *regs, struct target_sigcontext *sc)
> __get_user(regs->fpul, &sc->sc_fpul);
>
> regs->tra = -1; /* disable syscall checks */
> + regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
> }
>
> static void setup_frame(int sig, struct target_sigaction *ka,
Why not use TB_FLAG_ENVFLAGS_MASK, introduced earlier in this patch
series?
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 01/27] target/sh4: Use cmpxchg for movco
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 01/27] target/sh4: Use cmpxchg for movco Richard Henderson
@ 2017-07-15 23:22 ` Aurelien Jarno
2017-07-16 21:55 ` Aurelien Jarno
0 siblings, 1 reply; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-15 23:22 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> As for other targets, cmpxchg isn't quite right for ll/sc,
> suffering from an ABA race, but is sufficient to implement
> portable atomic operations.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/cpu.h | 3 ++-
> target/sh4/translate.c | 56 +++++++++++++++++++++++++++++++++-----------------
> 2 files changed, 39 insertions(+), 20 deletions(-)
For the linux-user case, where we need to emulate sequences that need
to be executed on multiple CPUs, while the ISA has been designed for
a single CPU, this patch looks good. There is no other real way to do
it.
For the system case, one might imagine using MOVLI/MOVCO with
different addresses, although 1) it hasn't been designed for that and 2) all
the sequences I have found use the same address. I therefore wonder if
we should just add the code to correctly clear LDST in case of interrupt
or exception.
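To illustrate the ABA caveat mentioned in the commit message, here is a
minimal sketch in portable C11. It is not the code from the patch:
LinkState, load_linked and store_conditional are hypothetical names
standing in for the MOVLI/MOVCO emulation. The load remembers the address
and the value it saw, and the store succeeds whenever that value is still
in memory, even if it was changed and changed back in between.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU link state; not QEMU's actual structure. */
typedef struct {
    _Atomic uint32_t *lock_addr;  /* address of the last linked load */
    uint32_t lock_value;          /* value observed by that load */
} LinkState;

/* Emulated load-linked: remember where we loaded from and what we saw. */
static uint32_t load_linked(LinkState *s, _Atomic uint32_t *addr)
{
    s->lock_addr = addr;
    s->lock_value = atomic_load(addr);
    return s->lock_value;
}

/* Emulated store-conditional built on compare-and-swap.  It succeeds if
   memory still holds the remembered value, which is weaker than real
   ll/sc: an A -> B -> A change in between goes unnoticed. */
static bool store_conditional(LinkState *s, _Atomic uint32_t *addr,
                              uint32_t newval)
{
    bool ok = false;

    if (s->lock_addr == addr) {
        uint32_t expected = s->lock_value;
        ok = atomic_compare_exchange_strong(addr, &expected, newval);
    }
    s->lock_addr = NULL;  /* the link is consumed either way */
    return ok;
}
```

With this sketch, a store_conditional that follows two intervening writes
restoring the original value still succeeds, which is exactly the case
where real hardware would fail. A portable atomic read-modify-write loop
does not care about that difference.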
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
2017-07-15 22:14 ` Aurelien Jarno
2017-07-15 22:16 ` John Paul Adrian Glaubitz
@ 2017-07-16 2:30 ` Richard Henderson
2017-07-16 15:18 ` Aurelien Jarno
1 sibling, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-16 2:30 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: qemu-devel, laurent, bruno, glaubitz
On 07/15/2017 12:14 PM, Aurelien Jarno wrote:
> On 2017-07-06 16:20, Richard Henderson wrote:
>> For uniprocessors, SH4 uses optimistic restartable atomic sequences.
>> Upon an interrupt, a real kernel would simply notice magic values in
>> the registers and reset the PC to the start of the sequence.
>>
>> For QEMU, we cannot do this in quite the same way. Instead, we notice
>> the normal start of such a sequence (mov #-x,r15), and start a new TB
>> that can be executed under cpu_exec_step_atomic.
>
> Do you actually have good documentation about gUSA? I have found
> a few documents (some of them in Japanese), the most complete one being
> the LinuxTag paper. The ABI is also described in the kernel and the
> glibc. That said, I am missing the following information:
Kernel sources and glibc are good. The description in GCC is pretty good as well:
https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/config/sh/sync.md?revision=243994&view=markup#l53
> - What kind of instructions are allowed in the atomic sequence? Your
> patch takes branches into account, but are they allowed? Used in
> practice? What about FP instructions?
Any sequence that can be restarted.
> - Is the atomic sequence actually allowed to cross pages?
I don't see why not. It would be restarted on a page fault, but assuming that
a process can have 2 pages in ram doesn't seem onerous.
> - Is there any alignment required? The paper mentions adding a nop to
> gUSA_exchange_and_add to align the end point to 4 bytes.
That's about the mov.l instruction that computes the address.
> This looks fine. I guess this can be improved in another patch by
> changing the caller to pass a single target address and whether the condition
> is true or false. This would avoid having to detect that here, and would
> not make the code below more complicated.
Sure.
>> + if (ctx->tbflags & GUSA_EXCLUSIVE) {
>> + /* Regardless of single-stepping or the end of the page,
>> + we must complete execution of the gUSA region while
>> + holding the exclusive lock. */
>> + *pmax_insns = max_insns;
>> + return 0;
>> }
>
> What are the consequences of not stopping the translation when crossing
> a page in user mode? If there are none, we should probably change
> the code to never stop when crossing pages.
The only consequence is recognizing a segv "too early". It's a quirk that,
frankly, I'm willing to accept.
>> + if (ctx.tbflags & GUSA_EXCLUSIVE) {
>> + /* Ending the region of exclusivity. Clear the bits. */
>> + ctx.envflags &= ~GUSA_MASK;
>> + }
>> +
>
> IIUC this assumes that every instruction in the sequence is always
> executed. I guess this is not correct if the TCG op buffer is full. Some
> non-privileged instructions might also stop the translation, but they
> are all FPU instructions, so I guess the section is simply not valid in
> that case.
The TCG op buffer is always empty when we begin the sequence. Since we're
limited to 128 bytes, or 64 instructions, I'm not concerned about running out.
I attempt to make sure that all of the paths out -- like exceptions and
branches -- that don't do what we want have GUSA state zapped.
At which point ya gets what ya gets. Not atomic, but not really wrong either.
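For readers following along, the bounds check at the top of decode_gusa
and the 64-instruction limit can be sketched standalone. This is only an
illustration: sextract32_sketch and gusa_region_insns are hypothetical
names, and GUSA_SHIFT is given an assumed value here rather than taken
from the tree. The 8-bit signed backup field ranges from -128 to -1, and
with 2-byte instructions that caps a region at 64 insns.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for this sketch only; see the QEMU tree for the real one. */
#define GUSA_SHIFT 4

/* Sign-extend `length` bits of `value` starting at bit `start`.
   A standalone stand-in for QEMU's sextract32. */
static int32_t sextract32_sketch(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length < 32 && start + length <= 32);
    uint32_t field = (value >> start) & ((1u << length) - 1u);
    uint32_t sign = 1u << (length - 1);
    return (int32_t)(field ^ sign) - (int32_t)sign;
}

/* Return the number of 16-bit insns in the gUSA region starting at pc,
   or 0 if the region is malformed (pc is not the region start, or the
   region holds fewer than two insns). */
static int gusa_region_insns(uint32_t pc, uint32_t pc_end, uint32_t tbflags)
{
    int backup = sextract32_sketch(tbflags, GUSA_SHIFT, 8);
    int max_insns = (int)((pc_end - pc) / 2);

    if (pc != pc_end + (uint32_t)backup || max_insns < 2) {
        return 0;
    }
    return max_insns;
}
```

A region ending at 0x1080 with a backup of -128 starts at 0x1000 and
holds the maximum of 64 insns; nothing larger can be encoded.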
r~
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries
2017-07-15 22:59 ` Aurelien Jarno
@ 2017-07-16 2:33 ` Richard Henderson
2017-07-16 15:18 ` Aurelien Jarno
0 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-16 2:33 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: qemu-devel, laurent, bruno, glaubitz
On 07/15/2017 12:59 PM, Aurelien Jarno wrote:
> On 2017-07-06 16:20, Richard Henderson wrote:
>> If a signal is delivered during the execution of a delay slot,
>> or a gUSA region, clear those bits from the environment so that
>> the signal handler does not start in that same state.
>
> How are signals delivered in linux-user? At least in system mode we
> forbid interrupts in the delay slot (see commit 5c6f3eb7db), as the
> manual clearly declares them indivisible. Maybe the same should be
> done for linux-user?
Signals get queued, and delivered eventually. I don't believe that we do
anything to check that "signals can't be delivered yet" like we do in system mode.
>> + regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
>> }
>>
>> static void setup_frame(int sig, struct target_sigaction *ka,
>
> Why not use TB_FLAG_ENVFLAGS_MASK, introduced earlier in this patch
> series?
I really want to clear these two sets. I didn't want to assume that
ENVFLAGS_MASK would never contain anything else.
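As a small illustration of the point, assuming placeholder bit layouts
(the numeric values below are chosen for the sketch and are not quoted
from the tree, and clean_signal_flags is a hypothetical name): masking
with exactly the two sets leaves any other flag bit untouched, whereas a
broader mask would silently start clearing whatever is added to it later.

```c
#include <stdint.h>

/* Placeholder layouts for this sketch; the real definitions are in QEMU. */
#define DELAY_SLOT_MASK  0x7u
#define GUSA_SHIFT       4
#define GUSA_EXCLUSIVE   (1u << 12)
#define GUSA_MASK        ((0xffu << GUSA_SHIFT) | GUSA_EXCLUSIVE)

/* Clear only the delay-slot and gUSA state, as restore_sigcontext does
   in the patch; every other bit of flags is preserved. */
static uint32_t clean_signal_flags(uint32_t flags)
{
    return flags & ~(DELAY_SLOT_MASK | GUSA_MASK);
}
```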
r~
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
2017-07-16 2:30 ` Richard Henderson
@ 2017-07-16 15:18 ` Aurelien Jarno
2017-07-16 19:35 ` Richard Henderson
0 siblings, 1 reply; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-16 15:18 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-15 16:30, Richard Henderson wrote:
> On 07/15/2017 12:14 PM, Aurelien Jarno wrote:
> > On 2017-07-06 16:20, Richard Henderson wrote:
> > > For uniprocessors, SH4 uses optimistic restartable atomic sequences.
> > > Upon an interrupt, a real kernel would simply notice magic values in
> > > the registers and reset the PC to the start of the sequence.
> > >
> > > For QEMU, we cannot do this in quite the same way. Instead, we notice
> > > the normal start of such a sequence (mov #-x,r15), and start a new TB
> > > that can be executed under cpu_exec_step_atomic.
> >
> > Do you actually have good documentation about gUSA? I have found
> > a few documents (some of them in Japanese), the most complete one being
> > the LinuxTag paper. The ABI is also described in the kernel and the
> > glibc. That said, I am missing the following information:
>
> Kernel sources and glibc are good. The description in GCC is pretty good as well:
>
> https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/config/sh/sync.md?revision=243994&view=markup#l53
Thanks for the pointer.
> > - What kind of instructions are allowed in the atomic sequence? Your
> > patch takes branches into account, but are they allowed? Used in
> > practice? What about FP instructions?
>
> Any sequence that can be restarted.
Ok. So I guess it means a sequence with a branch, taken or not taken, is
possible. In the same way, I guess that an instruction other than a mov
could set a negative stack pointer and start the sequence, although
that is not emitted by glibc or gcc.
Therefore we should consider this patchset to cover the most common
99% of cases. Of course the remaining 1% are not handled as
atomic, but apart from that they are still emulated correctly.
> > - Is the atomic sequence actually allowed to cross pages?
>
> I don't see why not. It would be restarted on a page fault, but assuming
> that a process can have 2 pages in ram doesn't seem onerous.
Ok. Same comment as above.
> > - Is there any alignment required? The paper mentions adding a nop to
> > gUSA_exchange_and_add to align the end point to 4 bytes.
>
> That's about the mov.l instruction that computes the address.
I guess you mean mova. That makes sense, and actually doesn't impact
QEMU.
> > This looks fine. I guess this can be improved in another patch by
> > changing the caller to pass a single target address and whether the condition
> > is true or false. This would avoid having to detect that here, and would
> > not make the code below more complicated.
>
> Sure.
>
> > > + if (ctx->tbflags & GUSA_EXCLUSIVE) {
> > > + /* Regardless of single-stepping or the end of the page,
> > > + we must complete execution of the gUSA region while
> > > + holding the exclusive lock. */
> > > + *pmax_insns = max_insns;
> > > + return 0;
> > > }
> >
> > What are the consequences of not stopping the translation when crossing
> > a page in user mode? If there are none, we should probably change
> > the code to never stop when crossing pages.
>
> The only consequence is recognizing a segv "too early". It's a quirk that,
> frankly, I'm willing to accept.
Ok.
> > > + if (ctx.tbflags & GUSA_EXCLUSIVE) {
> > > + /* Ending the region of exclusivity. Clear the bits. */
> > > + ctx.envflags &= ~GUSA_MASK;
> > > + }
> > > +
> >
> > IIUC this assumes that every instruction in the sequence is always
> > executed. I guess this is not correct if the TCG op buffer is full. Some
> > non-privileged instructions might also stop the translation, but they
> > are all FPU instructions, so I guess the section is simply not valid in
> > that case.
>
> The TCG op buffer is always empty when we begin the sequence. Since we're
> limited to 128 bytes, or 64 instructions, I'm not concerned about running
> out. I attempt to make sure that all of the paths out -- like exceptions and
> branches -- that don't do what we want have GUSA state zapped.
>
> At which point ya gets what ya gets. Not atomic, but not really wrong either.
>
Thanks for all the details, things are much clearer now. Therefore:
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
That said, for further improvements, did you consider decoding the gUSA
section in a helper? It might avoid having to emulate the atomic
sequence with 3 TBs in the worst case (the original one, the one to
decode the sequence and the one holding the exclusive lock). The helper
would have direct access to the r0 value, could decode the atomic
sequence and translate it into a call to the corresponding atomic
helpers. In the best case that means the sequence can be done in the
same TB. In the worst case, where the gUSA sequence is not recognized,
it means two TBs.
I might have forgotten something though.
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries
2017-07-16 2:33 ` Richard Henderson
@ 2017-07-16 15:18 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-16 15:18 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-15 16:33, Richard Henderson wrote:
> On 07/15/2017 12:59 PM, Aurelien Jarno wrote:
> > On 2017-07-06 16:20, Richard Henderson wrote:
> > > If a signal is delivered during the execution of a delay slot,
> > > or a gUSA region, clear those bits from the environment so that
> > > the signal handler does not start in that same state.
> >
> > How are signals delivered in linux-user? At least in system mode we
> > forbid interrupts in the delay slot (see commit 5c6f3eb7db), as the
> > manual clearly declares them indivisible. Maybe the same should be
> > done for linux-user?
>
> Signals get queued, and delivered eventually. I don't believe that we do
> anything to check that "signals can't be delivered yet" like we do in system
> mode.
Ok. I think it might be a good idea to implement that. I am not
sure that running the signal handler between an instruction and the
corresponding delay slot will always lead to correct behaviour.
> > > + regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
> > > }
> > > static void setup_frame(int sig, struct target_sigaction *ka,
> >
> > Why not use TB_FLAG_ENVFLAGS_MASK, introduced earlier in this patch
> > series?
>
> I really want to clear these two sets. I didn't want to assume that
> ENVFLAGS_MASK would never contain anything else.
>
Fair enough. Therefore:
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
2017-07-16 15:18 ` Aurelien Jarno
@ 2017-07-16 19:35 ` Richard Henderson
2017-07-16 21:43 ` Aurelien Jarno
0 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-16 19:35 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: qemu-devel, laurent, bruno, glaubitz
On 07/16/2017 05:18 AM, Aurelien Jarno wrote:
> That said, for further improvements, did you consider decoding the gUSA
> section in a helper? It might avoid having to emulate the atomic
> sequence with 3 TBs in the worst case (the original one, the one to
> decode the sequence and the one holding the exclusive lock). The helper
> would have direct access to the r0 value, could decode the atomic
> sequence and translate it into a call to the corresponding atomic
> helpers. In the best case that means the sequence can be done in the
> same TB. In the worst case, where the gUSA sequence is not recognized,
> it means two TBs.
I did not consider decoding the sequence in a helper.
I do want to cache the result in a TB, so that when we execute the atomic
sequence for a second time we do not have to re-interpret it. And I don't see
how to do that with a helper.
I *could* remember a mova into R0 earlier within the same block that performs
the mov #imm,r15. It's not guaranteed to be in the same TB, but looking
through dumps from debian unstable it's highly likely.
That could end up with the atomic sequence in the same TB that set SP, or with
the EXCP_ATOMIC exception in the same TB.
If it's ok with you, I'd prefer to handle that as a follow up. We would still
have to have the current setup as a backup just in case the mova isn't visible.
r~
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
2017-07-16 19:35 ` Richard Henderson
@ 2017-07-16 21:43 ` Aurelien Jarno
2017-07-16 21:59 ` Richard Henderson
0 siblings, 1 reply; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-16 21:43 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-16 09:35, Richard Henderson wrote:
> On 07/16/2017 05:18 AM, Aurelien Jarno wrote:
> > That said, for further improvements, did you consider decoding the gUSA
> > section in a helper? It might avoid having to emulate the atomic
> > sequence with 3 TBs in the worst case (the original one, the one to
> > decode the sequence and the one holding the exclusive lock). The helper
> > would have direct access to the r0 value, could decode the atomic
> > sequence and translate it into a call to the corresponding atomic
> > helpers. In the best case that means the sequence can be done in the
> > same TB. In the worst case, where the gUSA sequence is not recognized,
> > it means two TBs.
>
> I did not consider decoding the sequence in a helper.
>
> I do want to cache the result in a TB, so that when we execute the atomic
> sequence for a second time we do not have to re-interpret it. And I don't
> see how to do that with a helper.
Indeed, if the same atomic code is used often it might be better to have
it cached. That said, it's only true for TBs that are recognized, as IIRC
TBs with the exclusive lock are not cached.
> I *could* remember a mova into R0 earlier within the same block that
> performs the mov #imm,r15. It's not guaranteed to be in the same TB, but
> looking through dumps from debian unstable it's highly likely.
>
> That could end up with the atomic sequence in the same TB that set SP, or
> with the EXCP_ATOMIC exception in the same TB.
That would be a nice improvement indeed.
> If it's ok with you, I'd prefer to handle that as a follow up. We would
> still have to have the current setup as a backup just in case the mova isn't
> visible.
Yes, of course. I am fine with this version of the patch (hence the
R-b). I was just giving the ideas I got when reviewing this patch before
I forget about them.
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 01/27] target/sh4: Use cmpxchg for movco
2017-07-15 23:22 ` Aurelien Jarno
@ 2017-07-16 21:55 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-16 21:55 UTC (permalink / raw)
To: Richard Henderson; +Cc: bruno, qemu-devel, glaubitz, laurent
On 2017-07-16 01:22, Aurelien Jarno wrote:
> On 2017-07-06 16:20, Richard Henderson wrote:
> > As for other targets, cmpxchg isn't quite right for ll/sc,
> > suffering from an ABA race, but is sufficient to implement
> > portable atomic operations.
> >
> > Signed-off-by: Richard Henderson <rth@twiddle.net>
> > ---
> > target/sh4/cpu.h | 3 ++-
> > target/sh4/translate.c | 56 +++++++++++++++++++++++++++++++++-----------------
> > 2 files changed, 39 insertions(+), 20 deletions(-)
>
> For the linux-user case, where we need to emulate sequences that need
> to be executed on multiple CPUs, while the ISA has been designed for
> a single CPU, this patch looks good. There is no other real way to do
> it.
>
> For the system case, one might imagine using MOVLI/MOVCO with
> different addresses, although 1) it hasn't been designed for that and 2) all
> the sequences I have found use the same address. I therefore wonder if
> we should just add the code to correctly clear LDST in case of interrupt
> or exception.
I guess the patch for the system case is as simple as:
--- a/target/sh4/helper.c
+++ b/target/sh4/helper.c
@@ -86,6 +86,9 @@ void superh_cpu_do_interrupt(CPUState *cs)
int do_irq = cs->interrupt_request & CPU_INTERRUPT_HARD;
int do_exp, irq_vector = cs->exception_index;
+ /* LDST flag is cleared by an exception or an interrupt. */
+ env->ldst = 0;
+
/* prioritize exceptions over interrupts */
do_exp = cs->exception_index != -1;
Of course, integrating it with your patch means adding #ifdef / #else /
#endif around the system and user versions.
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
2017-07-16 21:43 ` Aurelien Jarno
@ 2017-07-16 21:59 ` Richard Henderson
2017-07-16 22:16 ` Aurelien Jarno
0 siblings, 1 reply; 89+ messages in thread
From: Richard Henderson @ 2017-07-16 21:59 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: bruno, qemu-devel, glaubitz, laurent
On 07/16/2017 11:43 AM, Aurelien Jarno wrote:
> Indeed, if the same atomic code is used often it might be better to have
> it cached. That said, it's only true for TBs that are recognized, as IIRC
> TBs with the exclusive lock are not cached.
At the moment they are not.
But in Emilio's multi-threaded tcg v2 patch set, posted just today, we change
that by adding a CF_PARALLEL flag to TB->cflags, and incorporating that into
the tb hash.
r~
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
2017-07-16 21:59 ` Richard Henderson
@ 2017-07-16 22:16 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-16 22:16 UTC (permalink / raw)
To: Richard Henderson; +Cc: bruno, qemu-devel, glaubitz, laurent
On 2017-07-16 11:59, Richard Henderson wrote:
> On 07/16/2017 11:43 AM, Aurelien Jarno wrote:
> > Indeed, if the same atomic code is used often it might be better to have
> > it cached. That said, it's only true for TBs that are recognized, as IIRC
> > TBs with the exclusive lock are not cached.
>
> At the moment they are not.
>
> But in Emilio's multi-threaded tcg v2 patch set, posted just today, we
> change that by adding a CF_PARALLEL flag to TB->cflags, and incorporating
> that into the tb hash.
Thanks for the hint, I'll have a look. Unfortunately I have a one-week
backlog on the mailing list...
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] [PATCH v2 07/27] target/sh4: Recognize common gUSA sequences
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 07/27] target/sh4: Recognize common gUSA sequences Richard Henderson
@ 2017-07-17 14:10 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-17 14:10 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> For many of the sequences produced by gcc or glibc,
> we can translate these as host atomic operations.
> Which saves the need to acquire the exclusive lock.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> target/sh4/translate.c | 316 +++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 316 insertions(+)
>
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index 653c06c..73b3e02 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -1894,10 +1894,17 @@ static void decode_opc(DisasContext * ctx)
> */
> static int decode_gusa(DisasContext *ctx, CPUSH4State *env, int *pmax_insns)
> {
> + uint16_t insns[5];
> + int ld_adr, ld_dst, ld_mop;
> + int op_dst, op_src, op_opc;
> + int mv_src, mt_dst, st_src, st_mop;
> + TCGv op_arg;
> +
> uint32_t pc = ctx->pc;
> uint32_t pc_end = ctx->tb->cs_base;
> int backup = sextract32(ctx->tbflags, GUSA_SHIFT, 8);
> int max_insns = (pc_end - pc) / 2;
> + int i;
>
> if (pc != pc_end + backup || max_insns < 2) {
> /* This is a malformed gUSA region. Don't do anything special,
> @@ -1914,6 +1921,315 @@ static int decode_gusa(DisasContext *ctx, CPUSH4State *env, int *pmax_insns)
> return 0;
> }
>
> + /* The state machine below will consume only a few insns.
> + If there are more than that in a region, fail now. */
> + if (max_insns > ARRAY_SIZE(insns)) {
> + goto fail;
> + }
> +
> + /* Read all of the insns for the region. */
> + for (i = 0; i < max_insns; ++i) {
> + insns[i] = cpu_lduw_code(env, pc + i * 2);
> + }
> +
> + ld_adr = ld_dst = ld_mop = -1;
> + mv_src = -1;
> + op_dst = op_src = op_opc = -1;
> + mt_dst = -1;
> + st_src = st_mop = -1;
> + TCGV_UNUSED(op_arg);
> + i = 0;
> +
> +#define NEXT_INSN \
> + do { if (i >= max_insns) goto fail; ctx->opcode = insns[i++]; } while (0)
> +
> + /*
> + * Expect a load to begin the region.
> + */
> + NEXT_INSN;
> + switch (ctx->opcode & 0xf00f) {
> + case 0x6000: /* mov.b @Rm,Rn */
> + ld_mop = MO_SB;
> + break;
> + case 0x6001: /* mov.w @Rm,Rn */
> + ld_mop = MO_TESW;
> + break;
> + case 0x6002: /* mov.l @Rm,Rn */
> + ld_mop = MO_TESL;
> + break;
> + default:
> + goto fail;
> + }
> + ld_adr = B7_4;
> + ld_dst = B11_8;
> + if (ld_adr == ld_dst) {
> + goto fail;
> + }
> + /* Unless we see a mov, any two-operand operation must use ld_dst. */
> + op_dst = ld_dst;
> +
> + /*
> + * Expect an optional register move.
> + */
> + NEXT_INSN;
> + switch (ctx->opcode & 0xf00f) {
> + case 0x6003: /* mov Rm,Rn */
> + /* Here we want to recognize ld_dst being saved for later consumption,
> + or for another input register being copied so that ld_dst need not
> + be clobbered during the operation. */
> + op_dst = B11_8;
> + mv_src = B7_4;
> + if (op_dst == ld_dst) {
> + /* Overwriting the load output. */
> + goto fail;
> + }
> + if (mv_src != ld_dst) {
> + /* Copying a new input; constrain op_src to match the load. */
> + op_src = ld_dst;
> + }
> + break;
> +
> + default:
> + /* Put back and re-examine as operation. */
> + --i;
> + }
> +
> + /*
> + * Expect the operation.
> + */
> + NEXT_INSN;
> + switch (ctx->opcode & 0xf00f) {
> + case 0x300c: /* add Rm,Rn */
> + op_opc = INDEX_op_add_i32;
> + goto do_reg_op;
> + case 0x2009: /* and Rm,Rn */
> + op_opc = INDEX_op_and_i32;
> + goto do_reg_op;
> + case 0x200a: /* xor Rm,Rn */
> + op_opc = INDEX_op_xor_i32;
> + goto do_reg_op;
> + case 0x200b: /* or Rm,Rn */
> + op_opc = INDEX_op_or_i32;
> + do_reg_op:
> + /* The operation register should be as expected, and the
> + other input cannot depend on the load. */
> + if (op_dst != B11_8) {
> + goto fail;
> + }
> + if (op_src < 0) {
> + /* Unconstrained input. */
> + op_src = B7_4;
> + } else if (op_src == B7_4) {
> + /* Constrained input matched load. All operations are
> + commutative; "swap" them by "moving" the load output
> + to the (implicit) first argument and the move source
> + to the (explicit) second argument. */
> + op_src = mv_src;
> + } else {
> + goto fail;
> + }
> + op_arg = REG(op_src);
> + break;
> +
> + case 0x6007: /* not Rm,Rn */
> + if (ld_dst != B7_4 || mv_src >= 0) {
> + goto fail;
> + }
> + op_dst = B11_8;
> + op_opc = INDEX_op_xor_i32;
> + op_arg = tcg_const_i32(-1);
This temp is never freed. Same for a few others below.
Overall, parsing the atomic sequence ends up being complex. I have
verified the most common sequences from GCC or GLIBC, and your code
seems fine for at least those cases.
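To make the first step of that state machine easier to follow, here is a
standalone sketch of how a 16-bit opcode is split into fields and
classified as the initial load. It is an illustration only: the
parameterized B11_8/B7_4 macros, the LD_* enum and classify_load are
stand-ins invented for this example, while the opcode patterns are the
ones matched in the patch.

```c
#include <stdint.h>

/* Register fields of a 16-bit SH4 opcode (parameterized stand-ins for
   the B11_8 / B7_4 macros used in translate.c). */
#define B11_8(op) (((op) >> 8) & 0xf)
#define B7_4(op)  (((op) >> 4) & 0xf)

/* Hypothetical result codes for this sketch. */
enum { LD_NONE = -1, LD_BYTE, LD_WORD, LD_LONG };

/* Classify the insn expected to begin a recognized gUSA region.
   On success, store the address and destination registers and return
   the load width; otherwise return LD_NONE. */
static int classify_load(uint16_t opcode, int *ld_adr, int *ld_dst)
{
    int kind;

    switch (opcode & 0xf00f) {
    case 0x6000: /* mov.b @Rm,Rn */
        kind = LD_BYTE;
        break;
    case 0x6001: /* mov.w @Rm,Rn */
        kind = LD_WORD;
        break;
    case 0x6002: /* mov.l @Rm,Rn */
        kind = LD_LONG;
        break;
    default:
        return LD_NONE;
    }

    *ld_adr = B7_4(opcode);
    *ld_dst = B11_8(opcode);
    if (*ld_adr == *ld_dst) {
        /* The load clobbers its own address register; reject it,
           as the patch does. */
        return LD_NONE;
    }
    return kind;
}
```

For example, 0x6212 encodes mov.l @r1,r2 and is accepted with ld_adr = 1
and ld_dst = 2, while 0x6112 (mov.l @r1,r1) is rejected.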
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [Qemu-devel] Fwd: [PATCH v2.5] fixup! linux-user/sh4: Notice gUSA regions during signal delivery
2017-07-07 19:00 ` Richard Henderson
@ 2017-07-17 14:15 ` Aurelien Jarno
0 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-17 14:15 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On 2017-07-07 09:00, Richard Henderson wrote:
> On 07/07/2017 07:57 AM, Richard Henderson wrote:
> > + /* ??? The SH4 kernel checks for an address above 0xC0000000.
> > + However, the page mappings in qemu linux-user aren't as restricted
> > + and we wind up with the normal stack mapped above 0xF0000000.
> > + That said, there is no reason why the kernel should be allowing
> > + a gUSA region that spans 1GB. Use a tighter check here, for what
> > + can actually be enabled by the immediate move. */
>
> Additionally, I can (and should) fix the address space problem for SH4 in
> linux-user/main.c, where we have already done so for MIPS and Nios2.
>
> See the initialization of reserved_va.
I guess that is what you have done in the "linux-user fixes for va
mapping" series, which renders this version 2.5 obsolete. Therefore I guess
version 2 is the one to be used instead.
Unfortunately my knowledge of linux-user is too limited to review
this new series.
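For reference, the difference between the two checks discussed in the quoted
comment can be sketched as follows. This is an illustration under stated
assumptions, not the code from the patch: the function names are
hypothetical, and the tight range is taken to be the negative values that an
8-bit sign-extended immediate move can load into the stack pointer (-128 to
-1). The exact bounds used in the patch may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Kernel-style test as described in the quoted comment:
   any stack pointer above 0xC0000000 counts as a gUSA marker. */
static bool gusa_check_kernel_style(uint32_t sp)
{
    return sp >= 0xC0000000u;
}

/* Assumed tighter test: only values reachable by "mov #imm,r15" with a
   negative 8-bit immediate, i.e. -128..-1 as unsigned 32-bit numbers. */
static bool gusa_check_tight(uint32_t sp)
{
    return sp >= 0xFFFFFF80u;   /* 0xFFFFFF80 is -128 */
}
```

With a normal linux-user stack mapped above 0xF0000000, for instance
sp = 0xF0001000, the kernel-style test fires spuriously while the tight test
does not. A real gUSA marker such as sp = 0xFFFFFFF0 (that is, -16) passes
both.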
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
` (26 preceding siblings ...)
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 27/27] target/sh4: Use tcg_gen_lookup_and_goto_ptr Richard Henderson
@ 2017-07-18 7:51 ` Aurelien Jarno
27 siblings, 0 replies; 89+ messages in thread
From: Aurelien Jarno @ 2017-07-18 7:51 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent, bruno, glaubitz
On 2017-07-06 16:20, Richard Henderson wrote:
> This fixes two problems with atomic operations on sh4,
> including an attempt at supporting the user-space atomics
> technique used by most sh-linux-user binaries.
>
> Changes since v1:
> * Rebase on Aurelien's recent sh4 patchset.
> * Patch 3,5 split out of patch 6.
> * Patch 4 fixes the sh4-softmmu problem that Aurelien reported.
> * Handle more cases of atomic_fetch_op seen in debian images.
> * More cleanups for register banking.
> * Fix for 64-bit fp memory operations.
> * Tidy illegal instruction checks.
> * Implement fpchg (missing from sh4a)
> * Implement fsrra (had a nop implementation)
> * Use tcg_gen_lookup_and_goto_ptr for simple branches.
I have finally finished reviewing all the patches. How do you want to
proceed from here? Do you want to post a new version, or should I just
pick up your patches and apply the small fixes that are needed?
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
Thread overview: 89+ messages
2017-07-07 2:20 [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Richard Henderson
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 01/27] target/sh4: Use cmpxchg for movco Richard Henderson
2017-07-15 23:22 ` Aurelien Jarno
2017-07-16 21:55 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 02/27] target/sh4: Consolidate end-of-TB tests Richard Henderson
2017-07-07 21:42 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 03/27] target/sh4: Introduce TB_FLAG_ENVFLAGS_MASK Richard Henderson
2017-07-07 21:42 ` Aurelien Jarno
2017-07-08 16:29 ` Philippe Mathieu-Daudé
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean Richard Henderson
2017-07-07 21:42 ` Aurelien Jarno
2017-07-08 16:31 ` Philippe Mathieu-Daudé
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA Richard Henderson
2017-07-07 21:42 ` Aurelien Jarno
2017-07-08 16:31 ` Philippe Mathieu-Daudé
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics Richard Henderson
2017-07-15 22:14 ` Aurelien Jarno
2017-07-15 22:16 ` John Paul Adrian Glaubitz
2017-07-16 2:30 ` Richard Henderson
2017-07-16 15:18 ` Aurelien Jarno
2017-07-16 19:35 ` Richard Henderson
2017-07-16 21:43 ` Aurelien Jarno
2017-07-16 21:59 ` Richard Henderson
2017-07-16 22:16 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 07/27] target/sh4: Recognize common gUSA sequences Richard Henderson
2017-07-17 14:10 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery Richard Henderson
2017-07-07 7:25 ` John Paul Adrian Glaubitz
2017-07-07 8:20 ` Richard Henderson
2017-07-07 8:30 ` John Paul Adrian Glaubitz
2017-07-07 8:35 ` John Paul Adrian Glaubitz
2017-07-07 16:22 ` Richard Henderson
2017-07-13 9:09 ` John Paul Adrian Glaubitz
2017-07-13 10:56 ` John Paul Adrian Glaubitz
2017-07-13 21:37 ` Richard Henderson
2017-07-13 21:42 ` John Paul Adrian Glaubitz
[not found] ` <20170707163826.22631-1-rth@twiddle.net>
2017-07-07 17:57 ` [Qemu-devel] Fwd: [PATCH v2.5] fixup! " Richard Henderson
2017-07-07 19:00 ` Richard Henderson
2017-07-17 14:15 ` Aurelien Jarno
2017-07-07 9:05 ` [Qemu-devel] [PATCH v2 08/27] " Laurent Vivier
2017-07-07 9:09 ` Laurent Vivier
2017-07-07 9:13 ` John Paul Adrian Glaubitz
2017-07-15 22:52 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries Richard Henderson
2017-07-15 22:59 ` Aurelien Jarno
2017-07-16 2:33 ` Richard Henderson
2017-07-16 15:18 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 10/27] target/sh4: Hoist register bank selection Richard Henderson
2017-07-07 21:48 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 11/27] target/sh4: Unify cpu_fregs into FREG Richard Henderson
2017-07-07 21:54 ` Aurelien Jarno
2017-07-08 16:54 ` Philippe Mathieu-Daudé
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 12/27] target/sh4: Pass DisasContext to fpr64 routines Richard Henderson
2017-07-07 21:55 ` Aurelien Jarno
2017-07-08 16:56 ` Philippe Mathieu-Daudé
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 13/27] target/sh4: Hoist fp register bank selection Richard Henderson
2017-07-07 21:57 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 14/27] target/sh4: Eliminate unused XREG macro Richard Henderson
2017-07-07 21:59 ` Aurelien Jarno
2017-07-07 2:20 ` [Qemu-devel] [PATCH v2 15/27] target/sh4: Merge DREG into fpr64 routines Richard Henderson
2017-07-07 22:06 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 16/27] target/sh4: Load/store Dr as 64-bit quantities Richard Henderson
2017-07-07 22:14 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 17/27] target/sh4: Simplify 64-bit fp reg-reg move Richard Henderson
2017-07-07 22:15 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 18/27] target/sh4: Unify code for CHECK_NOT_DELAY_SLOT Richard Henderson
2017-07-07 22:17 ` Aurelien Jarno
2017-07-08 16:59 ` Philippe Mathieu-Daudé
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 19/27] target/sh4: Unify code for CHECK_PRIVILEGED Richard Henderson
2017-07-07 22:17 ` Aurelien Jarno
2017-07-08 17:00 ` Philippe Mathieu-Daudé
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 20/27] target/sh4: Unify code for CHECK_FPU_ENABLED Richard Henderson
2017-07-07 22:18 ` Aurelien Jarno
2017-07-08 17:01 ` Philippe Mathieu-Daudé
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 21/27] target/sh4: Tidy misc illegal insn checks Richard Henderson
2017-07-07 22:18 ` Aurelien Jarno
2017-07-08 17:02 ` Philippe Mathieu-Daudé
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 22/27] target/sh4: Introduce CHECK_FPSCR_PR_* Richard Henderson
2017-07-07 22:20 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 23/27] target/sh4: Introduce CHECK_SH4A Richard Henderson
2017-07-07 22:21 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 24/27] target/sh4: Implement fpchg Richard Henderson
2017-07-07 22:23 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 25/27] target/sh4: Add missing FPSCR.PR == 0 checks Richard Henderson
2017-07-07 22:24 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 26/27] target/sh4: Implement fsrra Richard Henderson
2017-07-07 22:27 ` Aurelien Jarno
2017-07-07 2:21 ` [Qemu-devel] [PATCH v2 27/27] target/sh4: Use tcg_gen_lookup_and_goto_ptr Richard Henderson
2017-07-18 7:51 ` [Qemu-devel] [PATCH v2 00/27] target/sh4 improvements Aurelien Jarno