* [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext
@ 2023-04-27 22:59 Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 01/21] meson.build Add CONFIG_HEXAGON_IDEF_PARSER Taylor Simpson
` (20 more replies)
0 siblings, 21 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
This patch series achieves two major goals:
Goal 1: Short-circuit packet semantics
In certain cases, we can avoid the overhead of writing to
hex_new_value and write directly to hex_gpr.
Here's a simple example of the TCG generated for
0x004000b4: 0x7800c020 { R0 = #0x1 }
BEFORE:
---- 004000b4
movi_i32 new_r0,$0x1
mov_i32 r0,new_r0
AFTER:
---- 004000b4
movi_i32 r0,$0x1
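For readers less familiar with TCG, the BEFORE/AFTER difference above can be
restated as a small standalone C model. This is only a sketch and not QEMU
code: ToyState, toy_write_logged and toy_write_short_circuit are hypothetical
names, and the model does not try to decide when the short-circuit is legal
(the series only says "in certain cases").

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_GPRS 32

/* Toy register state; field names loosely mirror hex_gpr / hex_new_value. */
typedef struct {
    uint32_t gpr[TOY_NUM_GPRS];
    uint32_t new_value[TOY_NUM_GPRS];
    int moves;  /* how many modelled TCG moves were issued */
} ToyState;

/* BEFORE: stage the result in new_value, then copy it at packet commit. */
void toy_write_logged(ToyState *s, int rnum, uint32_t val)
{
    assert(rnum >= 0 && rnum < TOY_NUM_GPRS);
    s->new_value[rnum] = val;           /* movi_i32 new_r0,$0x1 */
    s->moves++;
    s->gpr[rnum] = s->new_value[rnum];  /* mov_i32 r0,new_r0 */
    s->moves++;
}

/* AFTER: write the architectural register directly. */
void toy_write_short_circuit(ToyState *s, int rnum, uint32_t val)
{
    assert(rnum >= 0 && rnum < TOY_NUM_GPRS);
    s->gpr[rnum] = val;                 /* movi_i32 r0,$0x1 */
    s->moves++;
}
```

Both paths leave the same value in the register; the short-circuit path
simply issues one modelled move instead of two.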
Goal 2: Move bookkeeping items from CPUHexagonState to DisasContext
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Several fields in CPUHexagonState are only used for bookkeeping
within the translation of a packet. With recent changes to eliminate
the need to free TCGv variables, these make more sense to be
transient and kept in DisasContext.
This patch series can be divided into 3 main parts:
Part 1: Patches 1-9
Cleanup in preparation for parts 2 and 3
The main goal is to move functionality out of generated helpers
Part 2: Patches 10-15
Short-circuit packet semantics
Part 3: Patches 16-21
Move bookkeeping items from CPUHexagonState to DisasContext
**** Changes in v2 ****
Address feedback from Richard Henderson <richard.henderson@linaro.org>
Cleaner implementation of gen_frame_scramble
Add g_assert_not_reached() in gen_framecheck
Move TCGv allocation inside gen_frame_scramble
Change tcg_gen_brcond_tl to tcg_gen_movcond_tl
Change static inline to G_GNUC_UNUSED
Removed in later patch
Change tcg_gen_not_i64 + tcg_gen_and_i64 to tcg_gen_andc_i64
Use full constant in gen_slotval
Taylor Simpson (21):
meson.build Add CONFIG_HEXAGON_IDEF_PARSER
Hexagon (target/hexagon) Add DisasContext arg to gen_log_reg_write
Hexagon (target/hexagon) Add overrides for loop setup instructions
Hexagon (target/hexagon) Add overrides for allocframe/deallocframe
Hexagon (target/hexagon) Add overrides for clr[tf]new
Hexagon (target/hexagon) Remove log_reg_write from op_helper.[ch]
Hexagon (target/hexagon) Eliminate uses of log_pred_write function
Hexagon (target/hexagon) Clean up pred_written usage
Hexagon (target/hexagon) Don't overlap dest writes with source reads
Hexagon (target/hexagon) Mark registers as read during packet analysis
Hexagon (target/hexagon) Short-circuit packet register writes
Hexagon (target/hexagon) Short-circuit packet predicate writes
Hexagon (target/hexagon) Short-circuit packet HVX writes
Hexagon (target/hexagon) Short-circuit more HVX single instruction
packets
Hexagon (target/hexagon) Add overrides for disabled idef-parser insns
Hexagon (target/hexagon) Make special new_value for USR
Hexagon (target/hexagon) Move new_value to DisasContext
Hexagon (target/hexagon) Move new_pred_value to DisasContext
Hexagon (target/hexagon) Move pred_written to DisasContext
Hexagon (target/hexagon) Move pkt_has_store_s1 to DisasContext
Hexagon (target/hexagon) Move items to DisasContext
meson.build | 1 +
target/hexagon/cpu.h | 10 +-
target/hexagon/gen_tcg.h | 116 ++++++-
target/hexagon/gen_tcg_hvx.h | 23 ++
target/hexagon/genptr.h | 6 +-
target/hexagon/helper.h | 6 +-
target/hexagon/macros.h | 57 ++--
target/hexagon/op_helper.h | 16 +-
target/hexagon/translate.h | 52 ++-
target/hexagon/attribs_def.h.inc | 6 +-
target/hexagon/arch.c | 3 +-
target/hexagon/cpu.c | 5 +-
target/hexagon/genptr.c | 347 ++++++++++++++++----
target/hexagon/idef-parser/parser-helpers.c | 4 +-
target/hexagon/op_helper.c | 154 ++++++---
target/hexagon/translate.c | 272 ++++++++++-----
tests/tcg/hexagon/hvx_misc.c | 21 ++
tests/tcg/hexagon/read_write_overlap.c | 136 ++++++++
target/hexagon/README | 6 +-
target/hexagon/gen_analyze_funcs.py | 51 ++-
target/hexagon/gen_helper_funcs.py | 9 +-
target/hexagon/gen_helper_protos.py | 10 +-
target/hexagon/gen_idef_parser_funcs.py | 7 +
target/hexagon/gen_tcg_funcs.py | 21 +-
target/hexagon/hex_common.py | 16 +-
tests/tcg/hexagon/Makefile.target | 1 +
26 files changed, 1063 insertions(+), 293 deletions(-)
create mode 100644 tests/tcg/hexagon/read_write_overlap.c
--
2.25.1
* [PATCH v2 01/21] meson.build Add CONFIG_HEXAGON_IDEF_PARSER
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
@ 2023-04-27 22:59 ` Taylor Simpson
2023-04-28 8:01 ` Richard Henderson
2023-04-27 22:59 ` [PATCH v2 02/21] Hexagon (target/hexagon) Add DisasContext arg to gen_log_reg_write Taylor Simpson
` (19 subsequent siblings)
20 siblings, 1 reply; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
Enable conditional compilation depending on whether idef-parser
is configured
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
meson.build | 1 +
1 file changed, 1 insertion(+)
diff --git a/meson.build b/meson.build
index c44d05a13f..d4e438b033 100644
--- a/meson.build
+++ b/meson.build
@@ -1859,6 +1859,7 @@ endif
config_host_data.set('CONFIG_GTK', gtk.found())
config_host_data.set('CONFIG_VTE', vte.found())
config_host_data.set('CONFIG_GTK_CLIPBOARD', have_gtk_clipboard)
+config_host_data.set('CONFIG_HEXAGON_IDEF_PARSER', get_option('hexagon_idef_parser'))
config_host_data.set('CONFIG_LIBATTR', have_old_libattr)
config_host_data.set('CONFIG_LIBCAP_NG', libcap_ng.found())
config_host_data.set('CONFIG_EBPF', libbpf.found())
--
2.25.1
* [PATCH v2 02/21] Hexagon (target/hexagon) Add DisasContext arg to gen_log_reg_write
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 01/21] meson.build Add CONFIG_HEXAGON_IDEF_PARSER Taylor Simpson
@ 2023-04-27 22:59 ` Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 03/21] Hexagon (target/hexagon) Add overrides for loop setup instructions Taylor Simpson
` (18 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
Add DisasContext arg to gen_log_reg_write_pair also
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/gen_tcg.h | 2 +-
target/hexagon/genptr.h | 2 +-
target/hexagon/genptr.c | 10 +++++-----
target/hexagon/idef-parser/parser-helpers.c | 2 +-
target/hexagon/README | 2 +-
target/hexagon/gen_tcg_funcs.py | 8 +++++---
6 files changed, 14 insertions(+), 12 deletions(-)
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 329e7a1024..060c11f6c0 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -515,7 +515,7 @@
do { \
TCGv_i64 RddV = get_result_gpr_pair(ctx, HEX_REG_FP); \
gen_return(ctx, RddV, hex_gpr[HEX_REG_FP]); \
- gen_log_reg_write_pair(HEX_REG_FP, RddV); \
+ gen_log_reg_write_pair(ctx, HEX_REG_FP, RddV); \
} while (0)
/*
diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
index 76e497aa48..75d0fc262d 100644
--- a/target/hexagon/genptr.h
+++ b/target/hexagon/genptr.h
@@ -35,7 +35,7 @@ void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src, uint32_t slot);
void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src, uint32_t slot);
TCGv gen_read_reg(TCGv result, int num);
TCGv gen_read_preg(TCGv pred, uint8_t num);
-void gen_log_reg_write(int rnum, TCGv val);
+void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val);
void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val);
void gen_set_usr_field(DisasContext *ctx, int field, TCGv val);
void gen_set_usr_fieldi(DisasContext *ctx, int field, int x);
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 502c85ae35..12c72cbac9 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -81,7 +81,7 @@ static TCGv_i64 get_result_gpr_pair(DisasContext *ctx, int rnum)
return result;
}
-void gen_log_reg_write(int rnum, TCGv val)
+void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
{
const target_ulong reg_mask = reg_immut_masks[rnum];
@@ -93,7 +93,7 @@ void gen_log_reg_write(int rnum, TCGv val)
}
}
-static void gen_log_reg_write_pair(int rnum, TCGv_i64 val)
+static void gen_log_reg_write_pair(DisasContext *ctx, int rnum, TCGv_i64 val)
{
const target_ulong reg_mask_low = reg_immut_masks[rnum];
const target_ulong reg_mask_high = reg_immut_masks[rnum + 1];
@@ -231,7 +231,7 @@ static inline void gen_write_ctrl_reg(DisasContext *ctx, int reg_num,
if (reg_num == HEX_REG_P3_0_ALIASED) {
gen_write_p3_0(ctx, val);
} else {
- gen_log_reg_write(reg_num, val);
+ gen_log_reg_write(ctx, reg_num, val);
if (reg_num == HEX_REG_QEMU_PKT_CNT) {
ctx->num_packets = 0;
}
@@ -255,7 +255,7 @@ static inline void gen_write_ctrl_reg_pair(DisasContext *ctx, int reg_num,
tcg_gen_extrh_i64_i32(val32, val);
tcg_gen_mov_tl(result, val32);
} else {
- gen_log_reg_write_pair(reg_num, val);
+ gen_log_reg_write_pair(ctx, reg_num, val);
if (reg_num == HEX_REG_QEMU_PKT_CNT) {
ctx->num_packets = 0;
ctx->num_insns = 0;
@@ -719,7 +719,7 @@ static void gen_cond_return_subinsn(DisasContext *ctx, TCGCond cond, TCGv pred)
{
TCGv_i64 RddV = get_result_gpr_pair(ctx, HEX_REG_FP);
gen_cond_return(ctx, RddV, hex_gpr[HEX_REG_FP], pred, cond);
- gen_log_reg_write_pair(HEX_REG_FP, RddV);
+ gen_log_reg_write_pair(ctx, HEX_REG_FP, RddV);
}
static void gen_endloop0(DisasContext *ctx)
diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c
index 86511efb62..ae0f60ada4 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -1318,7 +1318,7 @@ void gen_write_reg(Context *c, YYLTYPE *locp, HexValue *reg, HexValue *value)
value_m = rvalue_materialize(c, locp, &value_m);
OUT(c,
locp,
- "gen_log_reg_write(", &reg->reg.id, ", ",
+ "gen_log_reg_write(ctx, ", &reg->reg.id, ", ",
&value_m, ");\n");
}
diff --git a/target/hexagon/README b/target/hexagon/README
index ebafc78b1c..fe90df63e8 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -87,7 +87,7 @@ tcg_funcs_generated.c.inc
TCGv RsV = hex_gpr[insn->regno[1]];
TCGv RtV = hex_gpr[insn->regno[2]];
gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
- gen_log_reg_write(RdN, RdV);
+ gen_log_reg_write(ctx, RdN, RdV);
}
helper_funcs_generated.c.inc
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index fcb3384480..d9ccbe63f6 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -387,7 +387,8 @@ def gen_helper_call_imm(f, immlett):
def genptr_dst_write_pair(f, tag, regtype, regid):
- f.write(f" gen_log_reg_write_pair({regtype}{regid}N, " f"{regtype}{regid}V);\n")
+ f.write(f" gen_log_reg_write_pair(ctx, {regtype}{regid}N, "
+ f"{regtype}{regid}V);\n")
def genptr_dst_write(f, tag, regtype, regid):
@@ -396,7 +397,8 @@ def genptr_dst_write(f, tag, regtype, regid):
genptr_dst_write_pair(f, tag, regtype, regid)
elif regid in {"d", "e", "x", "y"}:
f.write(
- f" gen_log_reg_write({regtype}{regid}N, " f"{regtype}{regid}V);\n"
+ f" gen_log_reg_write(ctx, {regtype}{regid}N, "
+ f"{regtype}{regid}V);\n"
)
else:
print("Bad register parse: ", regtype, regid)
@@ -481,7 +483,7 @@ def genptr_dst_write_opn(f, regtype, regid, tag):
## TCGv RsV = hex_gpr[insn->regno[1]];
## TCGv RtV = hex_gpr[insn->regno[2]];
## <GEN>
-## gen_log_reg_write(RdN, RdV);
+## gen_log_reg_write(ctx, RdN, RdV);
## }
##
## where <GEN> depends on hex_common.skip_qemu_helper(tag)
--
2.25.1
* [PATCH v2 03/21] Hexagon (target/hexagon) Add overrides for loop setup instructions
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 01/21] meson.build Add CONFIG_HEXAGON_IDEF_PARSER Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 02/21] Hexagon (target/hexagon) Add DisasContext arg to gen_log_reg_write Taylor Simpson
@ 2023-04-27 22:59 ` Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 04/21] Hexagon (target/hexagon) Add overrides for allocframe/deallocframe Taylor Simpson
` (17 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
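As a reading aid, the implicit register writes performed by the overrides in
this patch can be modelled in plain C. This is a sketch under stated
assumptions, not the QEMU implementation: ToyLoopRegs and the toy_* functions
are hypothetical names, and the offset is assumed to be already 4-byte aligned
instead of modelling fIMMEXT/fPCALIGN.

```c
#include <assert.h>
#include <stdint.h>

/* Toy copy of the registers the loop setup instructions write implicitly. */
typedef struct {
    uint32_t lc0, sa0;  /* loop 0 count and start address */
    uint32_t lc1, sa1;  /* loop 1 count and start address */
    uint32_t lpcfg;     /* USR.LPCFG field */
    uint32_t p3;        /* predicate register P3 */
} ToyLoopRegs;

/* loop0: LC0 = count, SA0 = packet PC + offset, LPCFG = 0 */
void toy_loop0(ToyLoopRegs *r, uint32_t pkt_pc, uint32_t count, int32_t offset)
{
    assert((offset & 3) == 0);  /* assumption: offset already aligned */
    r->lc0 = count;
    r->sa0 = pkt_pc + (uint32_t)offset;
    r->lpcfg = 0;
}

/* loop1: LC1 = count, SA1 = packet PC + offset; LPCFG is left alone */
void toy_loop1(ToyLoopRegs *r, uint32_t pkt_pc, uint32_t count, int32_t offset)
{
    assert((offset & 3) == 0);
    r->lc1 = count;
    r->sa1 = pkt_pc + (uint32_t)offset;
}

/* spNloop0: like loop0, but LPCFG = N and P3 is cleared */
void toy_ploop(ToyLoopRegs *r, int n, uint32_t pkt_pc, uint32_t count,
               int32_t offset)
{
    assert(n >= 1 && n <= 3);
    assert((offset & 3) == 0);
    r->lc0 = count;
    r->sa0 = pkt_pc + (uint32_t)offset;
    r->lpcfg = (uint32_t)n;
    r->p3 = 0;
}
```

The point of the model is that every one of these writes targets a register
that does not appear as an explicit operand of the instruction.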
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/gen_tcg.h | 21 +++++++++++++++++++
target/hexagon/genptr.c | 44 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 65 insertions(+)
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 060c11f6c0..5774af4a59 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -663,6 +663,27 @@
#define fGEN_TCG_J2_callrf(SHORTCODE) \
gen_cond_callr(ctx, TCG_COND_NE, PuV, RsV)
+#define fGEN_TCG_J2_loop0r(SHORTCODE) \
+ gen_loop0r(ctx, RsV, riV)
+#define fGEN_TCG_J2_loop1r(SHORTCODE) \
+ gen_loop1r(ctx, RsV, riV)
+#define fGEN_TCG_J2_loop0i(SHORTCODE) \
+ gen_loop0i(ctx, UiV, riV)
+#define fGEN_TCG_J2_loop1i(SHORTCODE) \
+ gen_loop1i(ctx, UiV, riV)
+#define fGEN_TCG_J2_ploop1sr(SHORTCODE) \
+ gen_ploopNsr(ctx, 1, RsV, riV)
+#define fGEN_TCG_J2_ploop1si(SHORTCODE) \
+ gen_ploopNsi(ctx, 1, UiV, riV)
+#define fGEN_TCG_J2_ploop2sr(SHORTCODE) \
+ gen_ploopNsr(ctx, 2, RsV, riV)
+#define fGEN_TCG_J2_ploop2si(SHORTCODE) \
+ gen_ploopNsi(ctx, 2, UiV, riV)
+#define fGEN_TCG_J2_ploop3sr(SHORTCODE) \
+ gen_ploopNsr(ctx, 3, RsV, riV)
+#define fGEN_TCG_J2_ploop3si(SHORTCODE) \
+ gen_ploopNsi(ctx, 3, UiV, riV)
+
#define fGEN_TCG_J2_endloop0(SHORTCODE) \
gen_endloop0(ctx)
#define fGEN_TCG_J2_endloop1(SHORTCODE) \
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 12c72cbac9..4c34da8407 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -518,6 +518,50 @@ static void gen_compare(TCGCond cond, TCGv res, TCGv arg1, TCGv arg2)
tcg_gen_movcond_tl(cond, res, arg1, arg2, one, zero);
}
+#ifndef CONFIG_HEXAGON_IDEF_PARSER
+static inline void gen_loop0r(DisasContext *ctx, TCGv RsV, int riV)
+{
+ fIMMEXT(riV);
+ fPCALIGN(riV);
+ gen_log_reg_write(ctx, HEX_REG_LC0, RsV);
+ gen_log_reg_write(ctx, HEX_REG_SA0, tcg_constant_tl(ctx->pkt->pc + riV));
+ gen_set_usr_fieldi(ctx, USR_LPCFG, 0);
+}
+
+static void gen_loop0i(DisasContext *ctx, int count, int riV)
+{
+ gen_loop0r(ctx, tcg_constant_tl(count), riV);
+}
+
+static inline void gen_loop1r(DisasContext *ctx, TCGv RsV, int riV)
+{
+ fIMMEXT(riV);
+ fPCALIGN(riV);
+ gen_log_reg_write(ctx, HEX_REG_LC1, RsV);
+ gen_log_reg_write(ctx, HEX_REG_SA1, tcg_constant_tl(ctx->pkt->pc + riV));
+}
+
+static void gen_loop1i(DisasContext *ctx, int count, int riV)
+{
+ gen_loop1r(ctx, tcg_constant_tl(count), riV);
+}
+
+static void gen_ploopNsr(DisasContext *ctx, int N, TCGv RsV, int riV)
+{
+ fIMMEXT(riV);
+ fPCALIGN(riV);
+ gen_log_reg_write(ctx, HEX_REG_LC0, RsV);
+ gen_log_reg_write(ctx, HEX_REG_SA0, tcg_constant_tl(ctx->pkt->pc + riV));
+ gen_set_usr_fieldi(ctx, USR_LPCFG, N);
+ gen_log_pred_write(ctx, 3, tcg_constant_tl(0));
+}
+
+static void gen_ploopNsi(DisasContext *ctx, int N, int count, int riV)
+{
+ gen_ploopNsr(ctx, N, tcg_constant_tl(count), riV);
+}
+#endif
+
static void gen_cond_jumpr(DisasContext *ctx, TCGv dst_pc,
TCGCond cond, TCGv pred)
{
--
2.25.1
* [PATCH v2 04/21] Hexagon (target/hexagon) Add overrides for allocframe/deallocframe
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (2 preceding siblings ...)
2023-04-27 22:59 ` [PATCH v2 03/21] Hexagon (target/hexagon) Add overrides for loop setup instructions Taylor Simpson
@ 2023-04-27 22:59 ` Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 05/21] Hexagon (target/hexagon) Add overrides for clr[tf]new Taylor Simpson
` (16 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
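The frame scramble/unscramble arithmetic used by gen_allocframe and
gen_deallocframe below can be checked with a small standalone C model. This is
a sketch, not QEMU code: ToyCpu and the toy_* functions are hypothetical, the
stack is reduced to a single 64-bit slot, and the stack overflow check is not
modelled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t sp, fp, lr;  /* r29, r30, r31 */
    uint32_t framekey;
    uint32_t slot_addr;   /* address of the single modelled stack slot */
    uint64_t slot_val;    /* contents of that slot */
} ToyCpu;

/* frame = ((LR ^ FRAMEKEY) << 32) | FP */
uint64_t toy_frame_scramble(uint32_t lr, uint32_t fp, uint32_t framekey)
{
    return ((uint64_t)(lr ^ framekey) << 32) | fp;
}

/* frame ^= FRAMEKEY << 32 */
uint64_t toy_frame_unscramble(uint64_t frame, uint32_t framekey)
{
    return frame ^ ((uint64_t)framekey << 32);
}

/* Store the scrambled LR:FP pair at SP - 8, then set FP and SP. */
void toy_allocframe(ToyCpu *c, uint32_t framesize)
{
    uint32_t ea = c->sp - 8;
    c->slot_addr = ea;
    c->slot_val = toy_frame_scramble(c->lr, c->fp, c->framekey);
    c->fp = ea;
    c->sp = ea - framesize;
}

/* Load the pair back from FP, unscramble it, and restore SP, FP, LR. */
void toy_deallocframe(ToyCpu *c)
{
    uint64_t frame;
    assert(c->fp == c->slot_addr);  /* the model has only one slot */
    frame = toy_frame_unscramble(c->slot_val, c->framekey);
    c->sp = c->fp + 8;
    c->fp = (uint32_t)frame;
    c->lr = (uint32_t)(frame >> 32);
}
```

An allocframe followed by a deallocframe restores SP, FP and LR, which is the
round trip the two overrides have to preserve.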
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/gen_tcg.h | 32 +++++++++++++++++++++++++++
target/hexagon/genptr.c | 47 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 79 insertions(+)
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 5774af4a59..7c5cb93297 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -500,6 +500,38 @@
#define fGEN_TCG_Y2_icinva(SHORTCODE) \
do { RsV = RsV; } while (0)
+/*
+ * allocframe(#uiV)
+ * RxV == r29
+ */
+#define fGEN_TCG_S2_allocframe(SHORTCODE) \
+ gen_allocframe(ctx, RxV, uiV)
+
+/* sub-instruction version (no RxV, so handle it manually) */
+#define fGEN_TCG_SS2_allocframe(SHORTCODE) \
+ do { \
+ TCGv r29 = tcg_temp_new(); \
+ tcg_gen_mov_tl(r29, hex_gpr[HEX_REG_SP]); \
+ gen_allocframe(ctx, r29, uiV); \
+ gen_log_reg_write(ctx, HEX_REG_SP, r29); \
+ } while (0)
+
+/*
+ * Rdd32 = deallocframe(Rs32):raw
+ * RddV == r31:30
+ * RsV == r30
+ */
+#define fGEN_TCG_L2_deallocframe(SHORTCODE) \
+ gen_deallocframe(ctx, RddV, RsV)
+
+/* sub-instruction version (no RddV/RsV, so handle it manually) */
+#define fGEN_TCG_SL2_deallocframe(SHORTCODE) \
+ do { \
+ TCGv_i64 r31_30 = tcg_temp_new_i64(); \
+ gen_deallocframe(ctx, r31_30, hex_gpr[HEX_REG_FP]); \
+ gen_log_reg_write_pair(ctx, HEX_REG_FP, r31_30); \
+ } while (0)
+
/*
* dealloc_return
* Assembler mapped to
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 4c34da8407..43f6c6fb9f 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -709,6 +709,18 @@ static void gen_cond_callr(DisasContext *ctx,
gen_set_label(skip);
}
+#ifndef CONFIG_HEXAGON_IDEF_PARSER
+/* frame = ((LR << 32) | FP) ^ (FRAMEKEY << 32) */
+static TCGv_i64 gen_frame_scramble(void)
+{
+ TCGv_i64 frame = tcg_temp_new_i64();
+ TCGv tmp = tcg_temp_new();
+ tcg_gen_xor_tl(tmp, hex_gpr[HEX_REG_LR], hex_gpr[HEX_REG_FRAMEKEY]);
+ tcg_gen_concat_i32_i64(frame, hex_gpr[HEX_REG_FP], tmp);
+ return frame;
+}
+#endif
+
/* frame ^= (int64_t)FRAMEKEY << 32 */
static void gen_frame_unscramble(TCGv_i64 frame)
{
@@ -725,6 +737,41 @@ static void gen_load_frame(DisasContext *ctx, TCGv_i64 frame, TCGv EA)
tcg_gen_qemu_ld64(frame, EA, ctx->mem_idx);
}
+#ifndef CONFIG_HEXAGON_IDEF_PARSER
+/* Stack overflow check */
+static void gen_framecheck(TCGv EA, int framesize)
+{
+ /* Not modelled in linux-user mode */
+ /* Placeholder for system mode */
+#ifndef CONFIG_USER_ONLY
+ g_assert_not_reached();
+#endif
+}
+
+static void gen_allocframe(DisasContext *ctx, TCGv r29, int framesize)
+{
+ TCGv r30 = tcg_temp_new();
+ TCGv_i64 frame;
+ tcg_gen_addi_tl(r30, r29, -8);
+ frame = gen_frame_scramble();
+ gen_store8(cpu_env, r30, frame, ctx->insn->slot);
+ gen_log_reg_write(ctx, HEX_REG_FP, r30);
+ gen_framecheck(r30, framesize);
+ tcg_gen_subi_tl(r29, r30, framesize);
+}
+
+static void gen_deallocframe(DisasContext *ctx, TCGv_i64 r31_30, TCGv r30)
+{
+ TCGv r29 = tcg_temp_new();
+ TCGv_i64 frame = tcg_temp_new_i64();
+ gen_load_frame(ctx, frame, r30);
+ gen_frame_unscramble(frame);
+ tcg_gen_mov_i64(r31_30, frame);
+ tcg_gen_addi_tl(r29, r30, 8);
+ gen_log_reg_write(ctx, HEX_REG_SP, r29);
+}
+#endif
+
static void gen_return(DisasContext *ctx, TCGv_i64 dst, TCGv src)
{
/*
--
2.25.1
* [PATCH v2 05/21] Hexagon (target/hexagon) Add overrides for clr[tf]new
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (3 preceding siblings ...)
2023-04-27 22:59 ` [PATCH v2 04/21] Hexagon (target/hexagon) Add overrides for allocframe/deallocframe Taylor Simpson
@ 2023-04-27 22:59 ` Taylor Simpson
2023-04-28 8:04 ` Richard Henderson
2023-04-27 22:59 ` [PATCH v2 06/21] Hexagon (target/hexagon) Remove log_reg_write from op_helper.[ch] Taylor Simpson
` (15 subsequent siblings)
20 siblings, 1 reply; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
These instructions have implicit reads from p0, so we don't want
them in helpers when idef-parser is off.
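The two overrides below are built on tcg_gen_movcond_tl, which selects between
two values based on a comparison. A plain C model of that selection, as used
here, is sketched next. ToyCond, toy_movcond, toy_clrtnew and toy_clrfnew are
hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { TOY_COND_EQ, TOY_COND_NE } ToyCond;

/* Model of movcond: result = (c1 <cond> c2) ? v1 : v2 */
uint32_t toy_movcond(ToyCond cond, uint32_t c1, uint32_t c2,
                     uint32_t v1, uint32_t v2)
{
    bool taken = false;
    switch (cond) {
    case TOY_COND_EQ:
        taken = (c1 == c2);
        break;
    case TOY_COND_NE:
        taken = (c1 != c2);
        break;
    default:
        assert(0 && "unsupported condition in this sketch");
    }
    return taken ? v1 : v2;
}

/* if (p0.new) r0 = #0: keep rd when the new predicate is zero */
uint32_t toy_clrtnew(uint32_t rd, uint32_t p0_new)
{
    return toy_movcond(TOY_COND_EQ, p0_new, 0, rd, 0);
}

/* if (!p0.new) r0 = #0: keep rd when the new predicate is non-zero */
uint32_t toy_clrfnew(uint32_t rd, uint32_t p0_new)
{
    return toy_movcond(TOY_COND_NE, p0_new, 0, rd, 0);
}
```

Note that the model, like the macros in this patch, compares the whole new
predicate value against zero.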
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
target/hexagon/gen_tcg.h | 16 ++++++++++++++++
target/hexagon/macros.h | 4 ----
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 7c5cb93297..f3e9c280b0 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -1097,6 +1097,22 @@
gen_jump(ctx, riV); \
} while (0)
+/* if (p0.new) r0 = #0 */
+#define fGEN_TCG_SA1_clrtnew(SHORTCODE) \
+ do { \
+ tcg_gen_movcond_tl(TCG_COND_EQ, RdV, \
+ hex_new_pred_value[0], tcg_constant_tl(0), \
+ RdV, tcg_constant_tl(0)); \
+ } while (0)
+
+/* if (!p0.new) r0 = #0 */
+#define fGEN_TCG_SA1_clrfnew(SHORTCODE) \
+ do { \
+ tcg_gen_movcond_tl(TCG_COND_NE, RdV, \
+ hex_new_pred_value[0], tcg_constant_tl(0), \
+ RdV, tcg_constant_tl(0)); \
+ } while (0)
+
#define fGEN_TCG_J2_pause(SHORTCODE) \
do { \
uiV = uiV; \
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 3e162de3a7..2cb0647ce2 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -227,12 +227,8 @@ static inline void gen_cancel(uint32_t slot)
#ifdef QEMU_GENERATE
#define fLSBNEW(PVAL) tcg_gen_andi_tl(LSB, (PVAL), 1)
-#define fLSBNEW0 tcg_gen_andi_tl(LSB, hex_new_pred_value[0], 1)
-#define fLSBNEW1 tcg_gen_andi_tl(LSB, hex_new_pred_value[1], 1)
#else
#define fLSBNEW(PVAL) ((PVAL) & 1)
-#define fLSBNEW0 (env->new_pred_value[0] & 1)
-#define fLSBNEW1 (env->new_pred_value[1] & 1)
#endif
#ifdef QEMU_GENERATE
--
2.25.1
* [PATCH v2 06/21] Hexagon (target/hexagon) Remove log_reg_write from op_helper.[ch]
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (4 preceding siblings ...)
2023-04-27 22:59 ` [PATCH v2 05/21] Hexagon (target/hexagon) Add overrides for clr[tf]new Taylor Simpson
@ 2023-04-27 22:59 ` Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 07/21] Hexagon (target/hexagon) Eliminate uses of log_pred_write function Taylor Simpson
` (14 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
With the overrides added in prior commits, this function is no longer used.
Remove the remaining references in macros.h
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/macros.h | 14 --------------
target/hexagon/op_helper.h | 4 ----
target/hexagon/op_helper.c | 17 -----------------
3 files changed, 35 deletions(-)
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 2cb0647ce2..94a676fbf9 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -343,10 +343,6 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
#define fREAD_LR() (env->gpr[HEX_REG_LR])
-#define fWRITE_LR(A) log_reg_write(env, HEX_REG_LR, A)
-#define fWRITE_FP(A) log_reg_write(env, HEX_REG_FP, A)
-#define fWRITE_SP(A) log_reg_write(env, HEX_REG_SP, A)
-
#define fREAD_SP() (env->gpr[HEX_REG_SP])
#define fREAD_LC0 (env->gpr[HEX_REG_LC0])
#define fREAD_LC1 (env->gpr[HEX_REG_LC1])
@@ -371,16 +367,6 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
#define fBRANCH(LOC, TYPE) fWRITE_NPC(LOC)
#define fJUMPR(REGNO, TARGET, TYPE) fBRANCH(TARGET, COF_TYPE_JUMPR)
#define fHINTJR(TARGET) { /* Not modelled in qemu */}
-#define fWRITE_LOOP_REGS0(START, COUNT) \
- do { \
- log_reg_write(env, HEX_REG_LC0, COUNT); \
- log_reg_write(env, HEX_REG_SA0, START); \
- } while (0)
-#define fWRITE_LOOP_REGS1(START, COUNT) \
- do { \
- log_reg_write(env, HEX_REG_LC1, COUNT); \
- log_reg_write(env, HEX_REG_SA1, START);\
- } while (0)
#define fSET_OVERFLOW() SET_USR_FIELD(USR_OVF, 1)
#define fSET_LPCFG(VAL) SET_USR_FIELD(USR_LPCFG, (VAL))
diff --git a/target/hexagon/op_helper.h b/target/hexagon/op_helper.h
index db22b54401..6bd4b07849 100644
--- a/target/hexagon/op_helper.h
+++ b/target/hexagon/op_helper.h
@@ -19,15 +19,11 @@
#define HEXAGON_OP_HELPER_H
/* Misc functions */
-void write_new_pc(CPUHexagonState *env, bool pkt_has_multi_cof, target_ulong addr);
-
uint8_t mem_load1(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
uint16_t mem_load2(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
uint32_t mem_load4(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
uint64_t mem_load8(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
-void log_reg_write(CPUHexagonState *env, int rnum,
- target_ulong val);
void log_store64(CPUHexagonState *env, target_ulong addr,
int64_t val, int width, int slot);
void log_store32(CPUHexagonState *env, target_ulong addr,
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 3cc71b69d9..7e9e3f305e 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -52,23 +52,6 @@ G_NORETURN void HELPER(raise_exception)(CPUHexagonState *env, uint32_t excp)
do_raise_exception_err(env, excp, 0);
}
-void log_reg_write(CPUHexagonState *env, int rnum,
- target_ulong val)
-{
- HEX_DEBUG_LOG("log_reg_write[%d] = " TARGET_FMT_ld " (0x" TARGET_FMT_lx ")",
- rnum, val, val);
- if (val == env->gpr[rnum]) {
- HEX_DEBUG_LOG(" NO CHANGE");
- }
- HEX_DEBUG_LOG("\n");
-
- env->new_value[rnum] = val;
- if (HEX_DEBUG) {
- /* Do this so HELPER(debug_commit_end) will know */
- env->reg_written[rnum] = 1;
- }
-}
-
static void log_pred_write(CPUHexagonState *env, int pnum, target_ulong val)
{
HEX_DEBUG_LOG("log_pred_write[%d] = " TARGET_FMT_ld
--
2.25.1
* [PATCH v2 07/21] Hexagon (target/hexagon) Eliminate uses of log_pred_write function
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (5 preceding siblings ...)
2023-04-27 22:59 ` [PATCH v2 06/21] Hexagon (target/hexagon) Remove log_reg_write from op_helper.[ch] Taylor Simpson
@ 2023-04-27 22:59 ` Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 08/21] Hexagon (target/hexagon) Clean up pred_written usage Taylor Simpson
` (13 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
The following instructions are overridden
S2_cabacdecbin
SA1_cmpeqi
Remove the log_pred_write function from op_helper.c
Remove references in macros.h
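For reference, the behavior of the log_pred_write helper removed by this patch
can be restated as a standalone C model. This is a sketch: ToyPredLog and
toy_log_pred_write are hypothetical names, and only the and-ing rule from the
removed function is modelled (the debug logging is omitted).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_PREGS 4

typedef struct {
    uint32_t new_pred_value[TOY_NUM_PREGS];
    uint32_t pred_written;  /* bit N set once predicate N has been written */
} ToyPredLog;

/* Multiple writes to the same predicate in a packet are and'ed together. */
void toy_log_pred_write(ToyPredLog *log, int pnum, uint32_t val)
{
    assert(pnum >= 0 && pnum < TOY_NUM_PREGS);
    if (log->pred_written & (1u << pnum)) {
        log->new_pred_value[pnum] &= val & 0xff;
    } else {
        log->new_pred_value[pnum] = val & 0xff;
        log->pred_written |= 1u << pnum;
    }
}
```

The first write to a predicate stores the low 8 bits of the value; any later
write to the same predicate can only clear bits.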
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/gen_tcg.h | 16 +++++++
target/hexagon/helper.h | 2 +
target/hexagon/macros.h | 4 --
target/hexagon/genptr.c | 5 ++
target/hexagon/op_helper.c | 96 ++++++++++++++++++++++++++++++++------
5 files changed, 104 insertions(+), 19 deletions(-)
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index f3e9c280b0..2b2a6175a5 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -595,6 +595,14 @@
gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV); \
} while (0)
+#define fGEN_TCG_S2_cabacdecbin(SHORTCODE) \
+ do { \
+ TCGv p0 = tcg_temp_new(); \
+ gen_helper_cabacdecbin_pred(p0, RssV, RttV); \
+ gen_helper_cabacdecbin_val(RddV, RssV, RttV); \
+ gen_log_pred_write(ctx, 0, p0); \
+ } while (0)
+
/*
* Approximate reciprocal
* r3,p1 = sfrecipa(r0, r1)
@@ -900,6 +908,14 @@
#define fGEN_TCG_J4_tstbit0_fp1_jump_t(SHORTCODE) \
gen_cmpnd_tstbit0_jmp(ctx, 1, RsV, TCG_COND_NE, riV)
+/* p0 = cmp.eq(r0, #7) */
+#define fGEN_TCG_SA1_cmpeqi(SHORTCODE) \
+ do { \
+ TCGv p0 = tcg_temp_new(); \
+ gen_comparei(TCG_COND_EQ, p0, RsV, uiV); \
+ gen_log_pred_write(ctx, 0, p0); \
+ } while (0)
+
#define fGEN_TCG_J2_jump(SHORTCODE) \
gen_jump(ctx, riV)
#define fGEN_TCG_J2_jumpr(SHORTCODE) \
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index ed7f9842f6..73849e3d49 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -31,6 +31,8 @@ DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
DEF_HELPER_2(sfinvsqrta, i64, env, f32)
DEF_HELPER_4(vacsh_val, s64, env, s64, s64, s64)
DEF_HELPER_FLAGS_4(vacsh_pred, TCG_CALL_NO_RWG_SE, s32, env, s64, s64, s64)
+DEF_HELPER_FLAGS_2(cabacdecbin_val, TCG_CALL_NO_RWG_SE, s64, s64, s64)
+DEF_HELPER_FLAGS_2(cabacdecbin_pred, TCG_CALL_NO_RWG_SE, s32, s64, s64)
/* Floating point */
DEF_HELPER_2(conv_sf2df, f64, env, f32)
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 94a676fbf9..16e72ed0d5 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -371,10 +371,6 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
#define fSET_OVERFLOW() SET_USR_FIELD(USR_OVF, 1)
#define fSET_LPCFG(VAL) SET_USR_FIELD(USR_LPCFG, (VAL))
#define fGET_LPCFG (GET_USR_FIELD(USR_LPCFG))
-#define fWRITE_P0(VAL) log_pred_write(env, 0, VAL)
-#define fWRITE_P1(VAL) log_pred_write(env, 1, VAL)
-#define fWRITE_P2(VAL) log_pred_write(env, 2, VAL)
-#define fWRITE_P3(VAL) log_pred_write(env, 3, VAL)
#define fPART1(WORK) if (part1) { WORK; return; }
#define fCAST4u(A) ((uint32_t)(A))
#define fCAST4s(A) ((int32_t)(A))
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 43f6c6fb9f..cde5cff06a 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -560,6 +560,11 @@ static void gen_ploopNsi(DisasContext *ctx, int N, int count, int riV)
{
gen_ploopNsr(ctx, N, tcg_constant_tl(count), riV);
}
+
+static inline void gen_comparei(TCGCond cond, TCGv res, TCGv arg1, int arg2)
+{
+ gen_compare(cond, res, arg1, tcg_constant_tl(arg2));
+}
#endif
static void gen_cond_jumpr(DisasContext *ctx, TCGv dst_pc,
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 7e9e3f305e..46ccc59106 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -52,21 +52,6 @@ G_NORETURN void HELPER(raise_exception)(CPUHexagonState *env, uint32_t excp)
do_raise_exception_err(env, excp, 0);
}
-static void log_pred_write(CPUHexagonState *env, int pnum, target_ulong val)
-{
- HEX_DEBUG_LOG("log_pred_write[%d] = " TARGET_FMT_ld
- " (0x" TARGET_FMT_lx ")\n",
- pnum, val, val);
-
- /* Multiple writes to the same preg are and'ed together */
- if (env->pred_written & (1 << pnum)) {
- env->new_pred_value[pnum] &= val & 0xff;
- } else {
- env->new_pred_value[pnum] = val & 0xff;
- env->pred_written |= 1 << pnum;
- }
-}
-
void log_store32(CPUHexagonState *env, target_ulong addr,
target_ulong val, int width, int slot)
{
@@ -399,6 +384,87 @@ int32_t HELPER(vacsh_pred)(CPUHexagonState *env,
return PeV;
}
+int64_t HELPER(cabacdecbin_val)(int64_t RssV, int64_t RttV)
+{
+ int64_t RddV = 0;
+ size4u_t state;
+ size4u_t valMPS;
+ size4u_t bitpos;
+ size4u_t range;
+ size4u_t offset;
+ size4u_t rLPS;
+ size4u_t rMPS;
+
+ state = fEXTRACTU_RANGE(fGETWORD(1, RttV), 5, 0);
+ valMPS = fEXTRACTU_RANGE(fGETWORD(1, RttV), 8, 8);
+ bitpos = fEXTRACTU_RANGE(fGETWORD(0, RttV), 4, 0);
+ range = fGETWORD(0, RssV);
+ offset = fGETWORD(1, RssV);
+
+ /* calculate rLPS */
+ range <<= bitpos;
+ offset <<= bitpos;
+ rLPS = rLPS_table_64x4[state][(range >> 29) & 3];
+ rLPS = rLPS << 23; /* left aligned */
+
+ /* calculate rMPS */
+ rMPS = (range & 0xff800000) - rLPS;
+
+ /* most probable region */
+ if (offset < rMPS) {
+ RddV = AC_next_state_MPS_64[state];
+ fINSERT_RANGE(RddV, 8, 8, valMPS);
+ fINSERT_RANGE(RddV, 31, 23, (rMPS >> 23));
+ fSETWORD(1, RddV, offset);
+ }
+ /* least probable region */
+ else {
+ RddV = AC_next_state_LPS_64[state];
+ fINSERT_RANGE(RddV, 8, 8, ((!state) ? (1 - valMPS) : (valMPS)));
+ fINSERT_RANGE(RddV, 31, 23, (rLPS >> 23));
+ fSETWORD(1, RddV, (offset - rMPS));
+ }
+ return RddV;
+}
+
+int32_t HELPER(cabacdecbin_pred)(int64_t RssV, int64_t RttV)
+{
+ int32_t p0 = 0;
+ size4u_t state;
+ size4u_t valMPS;
+ size4u_t bitpos;
+ size4u_t range;
+ size4u_t offset;
+ size4u_t rLPS;
+ size4u_t rMPS;
+
+ state = fEXTRACTU_RANGE(fGETWORD(1, RttV), 5, 0);
+ valMPS = fEXTRACTU_RANGE(fGETWORD(1, RttV), 8, 8);
+ bitpos = fEXTRACTU_RANGE(fGETWORD(0, RttV), 4, 0);
+ range = fGETWORD(0, RssV);
+ offset = fGETWORD(1, RssV);
+
+ /* calculate rLPS */
+ range <<= bitpos;
+ offset <<= bitpos;
+ rLPS = rLPS_table_64x4[state][(range >> 29) & 3];
+ rLPS = rLPS << 23; /* left aligned */
+
+ /* calculate rMPS */
+ rMPS = (range & 0xff800000) - rLPS;
+
+ /* most probable region */
+ if (offset < rMPS) {
+ p0 = valMPS;
+
+ }
+ /* least probable region */
+ else {
+ p0 = valMPS ^ 1;
+ }
+ return p0;
+}
+
static void probe_store(CPUHexagonState *env, int slot, int mmu_idx,
bool is_predicated)
{
--
2.25.1
* [PATCH v2 08/21] Hexagon (target/hexagon) Clean up pred_written usage
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (6 preceding siblings ...)
2023-04-27 22:59 ` [PATCH v2 07/21] Hexagon (target/hexagon) Eliminate uses of log_pred_write function Taylor Simpson
@ 2023-04-27 22:59 ` Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 09/21] Hexagon (target/hexagon) Don't overlap dest writes with source reads Taylor Simpson
` (12 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 22:59 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
Only endloop instructions will conditionally write to a predicate.
When there is an endloop instruction, we preload the values into
new_pred_value.
The only place pred_written is needed is when HEX_DEBUG is on.
We remove the last use of check_for_attrib. However, new uses will be
introduced later in this series, so we mark it with G_GNUC_UNUSED.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
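Not part of the patch: a minimal plain-C model of the scheme described in
the commit message, written as a sketch under stated assumptions. PredModel
and the model_* functions are hypothetical names used only for illustration;
they are not QEMU code. The sketch shows why preloading new_pred_value at
packet start lets the end-of-packet commit become an unconditional copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PREGS 4

/* Hypothetical stand-in for the predicate state; not a QEMU type */
typedef struct {
    uint8_t pred[NUM_PREGS];
    uint8_t new_pred_value[NUM_PREGS];
} PredModel;

/* Packet start: preload the logged predicates when the packet has an endloop */
static void model_start_packet(PredModel *m, const int *preg_log,
                               int preg_log_idx, bool pkt_has_endloop)
{
    if (!pkt_has_endloop) {
        return;
    }
    for (int i = 0; i < preg_log_idx; i++) {
        int pred_num = preg_log[i];
        assert(pred_num >= 0 && pred_num < NUM_PREGS);
        m->new_pred_value[pred_num] = m->pred[pred_num];
    }
}

/* Conditional write performed by endloop: p3 = 0xff only if lpcfg == 1 */
static void model_endloop_write_p3(PredModel *m, uint32_t lpcfg)
{
    if (lpcfg == 1) {
        m->new_pred_value[3] = 0xff;
    }
}

/* Packet commit: unconditional copy, no pred_written test is needed */
static void model_commit_preds(PredModel *m, const int *preg_log,
                               int preg_log_idx)
{
    for (int i = 0; i < preg_log_idx; i++) {
        int pred_num = preg_log[i];
        m->pred[pred_num] = m->new_pred_value[pred_num];
    }
}
```

When the conditional write does not fire, the commit copies back the
preloaded value, so the predicate is unchanged.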
target/hexagon/genptr.c | 16 +++++-------
target/hexagon/translate.c | 53 ++++++++++++--------------------------
2 files changed, 23 insertions(+), 46 deletions(-)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index cde5cff06a..2014a8068a 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -137,7 +137,9 @@ void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
tcg_gen_and_tl(hex_new_pred_value[pnum],
hex_new_pred_value[pnum], base_val);
}
- tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pnum);
+ if (HEX_DEBUG) {
+ tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pnum);
+ }
set_bit(pnum, ctx->pregs_written);
}
@@ -826,15 +828,13 @@ static void gen_endloop0(DisasContext *ctx)
/*
* if (lpcfg == 1) {
- * hex_new_pred_value[3] = 0xff;
- * hex_pred_written |= 1 << 3;
+ * p3 = 0xff;
* }
*/
TCGLabel *label1 = gen_new_label();
tcg_gen_brcondi_tl(TCG_COND_NE, lpcfg, 1, label1);
{
- tcg_gen_movi_tl(hex_new_pred_value[3], 0xff);
- tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << 3);
+ gen_log_pred_write(ctx, 3, tcg_constant_tl(0xff));
}
gen_set_label(label1);
@@ -903,14 +903,12 @@ static void gen_endloop01(DisasContext *ctx)
/*
* if (lpcfg == 1) {
- * hex_new_pred_value[3] = 0xff;
- * hex_pred_written |= 1 << 3;
+ * p3 = 0xff;
* }
*/
tcg_gen_brcondi_tl(TCG_COND_NE, lpcfg, 1, label1);
{
- tcg_gen_movi_tl(hex_new_pred_value[3], 0xff);
- tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << 3);
+ gen_log_pred_write(ctx, 3, tcg_constant_tl(0xff));
}
gen_set_label(label1);
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index c087f183d0..6b004b6248 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -239,7 +239,7 @@ static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
return nwords;
}
-static bool check_for_attrib(Packet *pkt, int attrib)
+static G_GNUC_UNUSED bool check_for_attrib(Packet *pkt, int attrib)
{
for (int i = 0; i < pkt->num_insns; i++) {
if (GET_ATTRIB(pkt->insn[i].opcode, attrib)) {
@@ -262,11 +262,6 @@ static bool need_slot_cancelled(Packet *pkt)
return false;
}
-static bool need_pred_written(Packet *pkt)
-{
- return check_for_attrib(pkt, A_WRITES_PRED_REG);
-}
-
static bool need_next_PC(DisasContext *ctx)
{
Packet *pkt = ctx->pkt;
@@ -414,7 +409,7 @@ static void gen_start_packet(DisasContext *ctx)
tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], next_PC);
}
}
- if (need_pred_written(pkt)) {
+ if (HEX_DEBUG) {
tcg_gen_movi_tl(hex_pred_written, 0);
}
@@ -428,6 +423,17 @@ static void gen_start_packet(DisasContext *ctx)
}
}
+ /*
+ * Preload the predicated pred registers into hex_new_pred_value[pred_num]
+ * Only endloop instructions conditionally write to pred registers
+ */
+ if (pkt->pkt_has_endloop) {
+ for (int i = 0; i < ctx->preg_log_idx; i++) {
+ int pred_num = ctx->preg_log[i];
+ tcg_gen_mov_tl(hex_new_pred_value[pred_num], hex_pred[pred_num]);
+ }
+ }
+
/* Preload the predicated HVX registers into future_VRegs and tmp_VRegs */
if (!bitmap_empty(ctx->predicated_future_vregs, NUM_VREGS)) {
int i = find_first_bit(ctx->predicated_future_vregs, NUM_VREGS);
@@ -532,41 +538,14 @@ static void gen_reg_writes(DisasContext *ctx)
static void gen_pred_writes(DisasContext *ctx)
{
- int i;
-
/* Early exit if the log is empty */
if (!ctx->preg_log_idx) {
return;
}
- /*
- * Only endloop instructions will conditionally
- * write a predicate. If there are no endloop
- * instructions, we can use the non-conditional
- * write of the predicates.
- */
- if (ctx->pkt->pkt_has_endloop) {
- TCGv zero = tcg_constant_tl(0);
- TCGv pred_written = tcg_temp_new();
- for (i = 0; i < ctx->preg_log_idx; i++) {
- int pred_num = ctx->preg_log[i];
-
- tcg_gen_andi_tl(pred_written, hex_pred_written, 1 << pred_num);
- tcg_gen_movcond_tl(TCG_COND_NE, hex_pred[pred_num],
- pred_written, zero,
- hex_new_pred_value[pred_num],
- hex_pred[pred_num]);
- }
- } else {
- for (i = 0; i < ctx->preg_log_idx; i++) {
- int pred_num = ctx->preg_log[i];
- tcg_gen_mov_tl(hex_pred[pred_num], hex_new_pred_value[pred_num]);
- if (HEX_DEBUG) {
- /* Do this so HELPER(debug_commit_end) will know */
- tcg_gen_ori_tl(hex_pred_written, hex_pred_written,
- 1 << pred_num);
- }
- }
+ for (int i = 0; i < ctx->preg_log_idx; i++) {
+ int pred_num = ctx->preg_log[i];
+ tcg_gen_mov_tl(hex_pred[pred_num], hex_new_pred_value[pred_num]);
}
}
--
2.25.1
* [PATCH v2 09/21] Hexagon (target/hexagon) Don't overlap dest writes with source reads
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (7 preceding siblings ...)
2023-04-27 22:59 ` [PATCH v2 08/21] Hexagon (target/hexagon) Clean up pred_written usage Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 10/21] Hexagon (target/hexagon) Mark registers as read during packet analysis Taylor Simpson
` (11 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
When generating TCG, make sure we have read all the operand registers
before writing to the destination registers.
This is a prerequisite for short-circuiting where the source and dest
operands could be the same.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
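Not part of the patch: a small plain-C sketch of the hazard, assuming the
destination and source may alias once short-circuiting is enabled. The
model_* names are hypothetical and pointers stand in for TCGv operands. It
contrasts the old ordering (write dest, then compare against source) with
the new ordering (compute into a temporary, write dest last).

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturation of val to 'width' bits (1 <= width <= 31) */
static int32_t model_sat(int32_t val, int width)
{
    assert(width >= 1 && width <= 31);
    int32_t max_val = (int32_t)((1u << (width - 1)) - 1);
    int32_t min_val = -max_val - 1;

    if (val > max_val) {
        return max_val;
    }
    if (val < min_val) {
        return min_val;
    }
    return val;
}

/* Old ordering: dest is written before source is read for the compare */
static int model_sat_ovfl_old(int32_t *dest, const int32_t *source, int width)
{
    *dest = model_sat(*source, width);
    return *source != *dest;    /* always 0 when dest == source */
}

/* New ordering: all reads of source happen before dest is written */
static int model_sat_ovfl_new(int32_t *dest, const int32_t *source, int width)
{
    int32_t tmp = model_sat(*source, width);
    int ovfl = *source != tmp;

    *dest = tmp;
    return ovfl;
}
```

With dest == source, the old ordering loses the overflow indication because
the compare sees the already-saturated value. The temporary avoids that.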
target/hexagon/genptr.c | 45 ++++++++++++++++++++++++++---------------
1 file changed, 29 insertions(+), 16 deletions(-)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 2014a8068a..aff9ffe37b 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -971,6 +971,7 @@ static void gen_cmpi_jumpnv(DisasContext *ctx,
/* Shift left with saturation */
static void gen_shl_sat(DisasContext *ctx, TCGv dst, TCGv src, TCGv shift_amt)
{
+ TCGv tmp = tcg_temp_new(); /* In case dst == src */
TCGv usr = get_result_gpr(ctx, HEX_REG_USR);
TCGv sh32 = tcg_temp_new();
TCGv dst_sar = tcg_temp_new();
@@ -995,17 +996,17 @@ static void gen_shl_sat(DisasContext *ctx, TCGv dst, TCGv src, TCGv shift_amt)
*/
tcg_gen_andi_tl(sh32, shift_amt, 31);
- tcg_gen_movcond_tl(TCG_COND_EQ, dst, sh32, shift_amt,
+ tcg_gen_movcond_tl(TCG_COND_EQ, tmp, sh32, shift_amt,
src, tcg_constant_tl(0));
- tcg_gen_shl_tl(dst, dst, sh32);
- tcg_gen_sar_tl(dst_sar, dst, sh32);
+ tcg_gen_shl_tl(tmp, tmp, sh32);
+ tcg_gen_sar_tl(dst_sar, tmp, sh32);
tcg_gen_movcond_tl(TCG_COND_LT, satval, src, tcg_constant_tl(0), min, max);
tcg_gen_setcond_tl(TCG_COND_NE, ovf, dst_sar, src);
tcg_gen_shli_tl(ovf, ovf, reg_field_info[USR_OVF].offset);
tcg_gen_or_tl(usr, usr, ovf);
- tcg_gen_movcond_tl(TCG_COND_EQ, dst, dst_sar, src, dst, satval);
+ tcg_gen_movcond_tl(TCG_COND_EQ, dst, dst_sar, src, tmp, satval);
}
static void gen_sar(TCGv dst, TCGv src, TCGv shift_amt)
@@ -1228,22 +1229,28 @@ void gen_sat_i32(TCGv dest, TCGv source, int width)
void gen_sat_i32_ovfl(TCGv ovfl, TCGv dest, TCGv source, int width)
{
- gen_sat_i32(dest, source, width);
- tcg_gen_setcond_tl(TCG_COND_NE, ovfl, source, dest);
+ TCGv tmp = tcg_temp_new(); /* In case dest == source */
+ gen_sat_i32(tmp, source, width);
+ tcg_gen_setcond_tl(TCG_COND_NE, ovfl, source, tmp);
+ tcg_gen_mov_tl(dest, tmp);
}
void gen_satu_i32(TCGv dest, TCGv source, int width)
{
+ TCGv tmp = tcg_temp_new(); /* In case dest == source */
TCGv max_val = tcg_constant_tl((1 << width) - 1);
TCGv zero = tcg_constant_tl(0);
- tcg_gen_movcond_tl(TCG_COND_GTU, dest, source, max_val, max_val, source);
- tcg_gen_movcond_tl(TCG_COND_LT, dest, source, zero, zero, dest);
+ tcg_gen_movcond_tl(TCG_COND_GTU, tmp, source, max_val, max_val, source);
+ tcg_gen_movcond_tl(TCG_COND_LT, tmp, source, zero, zero, tmp);
+ tcg_gen_mov_tl(dest, tmp);
}
void gen_satu_i32_ovfl(TCGv ovfl, TCGv dest, TCGv source, int width)
{
- gen_satu_i32(dest, source, width);
- tcg_gen_setcond_tl(TCG_COND_NE, ovfl, source, dest);
+ TCGv tmp = tcg_temp_new(); /* In case dest == source */
+ gen_satu_i32(tmp, source, width);
+ tcg_gen_setcond_tl(TCG_COND_NE, ovfl, source, tmp);
+ tcg_gen_mov_tl(dest, tmp);
}
void gen_sat_i64(TCGv_i64 dest, TCGv_i64 source, int width)
@@ -1256,27 +1263,33 @@ void gen_sat_i64(TCGv_i64 dest, TCGv_i64 source, int width)
void gen_sat_i64_ovfl(TCGv ovfl, TCGv_i64 dest, TCGv_i64 source, int width)
{
+ TCGv_i64 tmp = tcg_temp_new_i64(); /* In case dest == source */
TCGv_i64 ovfl_64;
- gen_sat_i64(dest, source, width);
+ gen_sat_i64(tmp, source, width);
ovfl_64 = tcg_temp_new_i64();
- tcg_gen_setcond_i64(TCG_COND_NE, ovfl_64, dest, source);
+ tcg_gen_setcond_i64(TCG_COND_NE, ovfl_64, tmp, source);
+ tcg_gen_mov_i64(dest, tmp);
tcg_gen_trunc_i64_tl(ovfl, ovfl_64);
}
void gen_satu_i64(TCGv_i64 dest, TCGv_i64 source, int width)
{
+ TCGv_i64 tmp = tcg_temp_new_i64(); /* In case dest == source */
TCGv_i64 max_val = tcg_constant_i64((1LL << width) - 1LL);
TCGv_i64 zero = tcg_constant_i64(0);
- tcg_gen_movcond_i64(TCG_COND_GTU, dest, source, max_val, max_val, source);
- tcg_gen_movcond_i64(TCG_COND_LT, dest, source, zero, zero, dest);
+ tcg_gen_movcond_i64(TCG_COND_GTU, tmp, source, max_val, max_val, source);
+ tcg_gen_movcond_i64(TCG_COND_LT, tmp, source, zero, zero, tmp);
+ tcg_gen_mov_i64(dest, tmp);
}
void gen_satu_i64_ovfl(TCGv ovfl, TCGv_i64 dest, TCGv_i64 source, int width)
{
+ TCGv_i64 tmp = tcg_temp_new_i64(); /* In case dest == source */
TCGv_i64 ovfl_64;
- gen_satu_i64(dest, source, width);
+ gen_satu_i64(tmp, source, width);
ovfl_64 = tcg_temp_new_i64();
- tcg_gen_setcond_i64(TCG_COND_NE, ovfl_64, dest, source);
+ tcg_gen_setcond_i64(TCG_COND_NE, ovfl_64, tmp, source);
+ tcg_gen_mov_i64(dest, tmp);
tcg_gen_trunc_i64_tl(ovfl, ovfl_64);
}
--
2.25.1
* [PATCH v2 10/21] Hexagon (target/hexagon) Mark registers as read during packet analysis
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (8 preceding siblings ...)
2023-04-27 23:00 ` [PATCH v2 09/21] Hexagon (target/hexagon) Don't overlap dest writes with source reads Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 11/21] Hexagon (target/hexagon) Short-circuit packet register writes Taylor Simpson
` (10 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
Have gen_analyze_funcs mark the registers that are read by the
instruction. We also mark the implicit reads using instruction
attributes.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
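Not part of the patch: a minimal sketch of the read tracking, assuming a
64-bit word is wide enough for each bitmap. ReadModel, MODEL_NUM_REGS and
the model_* functions are hypothetical; the real code uses DECLARE_BITMAP
and set_bit on DisasContext. Note the different pairing rules: scalar pairs
mark rnum and rnum + 1, while HVX pairs mark rnum ^ 0 and rnum ^ 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_REGS 64   /* assumed size for this sketch */

/* Hypothetical stand-in for the read bitmaps in DisasContext */
typedef struct {
    uint64_t regs_read;
    uint64_t vregs_read;
} ReadModel;

static void model_log_reg_read(ReadModel *m, int rnum)
{
    assert(rnum >= 0 && rnum < MODEL_NUM_REGS);
    m->regs_read |= UINT64_C(1) << rnum;
}

/* Scalar register pair: marks rnum and rnum + 1 */
static void model_log_reg_read_pair(ReadModel *m, int rnum)
{
    model_log_reg_read(m, rnum);
    model_log_reg_read(m, rnum + 1);
}

static void model_log_vreg_read(ReadModel *m, int rnum)
{
    assert(rnum >= 0 && rnum < MODEL_NUM_REGS);
    m->vregs_read |= UINT64_C(1) << rnum;
}

/* HVX register pair: marks rnum ^ 0 and rnum ^ 1 */
static void model_log_vreg_read_pair(ReadModel *m, int rnum)
{
    model_log_vreg_read(m, rnum ^ 0);
    model_log_vreg_read(m, rnum ^ 1);
}

/* Query used by later analysis to see whether a register was read */
static bool model_reg_was_read(const ReadModel *m, int rnum)
{
    return (m->regs_read >> rnum) & 1;
}
```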
target/hexagon/translate.h | 36 +++++++++++++++++++++++
target/hexagon/attribs_def.h.inc | 6 +++-
target/hexagon/translate.c | 20 +++++++++++++
target/hexagon/gen_analyze_funcs.py | 44 ++++++++++++++++++++---------
target/hexagon/hex_common.py | 6 ++++
5 files changed, 97 insertions(+), 15 deletions(-)
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 4b9f21c41d..f72228859f 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -38,10 +38,12 @@ typedef struct DisasContext {
int reg_log[REG_WRITES_MAX];
int reg_log_idx;
DECLARE_BITMAP(regs_written, TOTAL_PER_THREAD_REGS);
+ DECLARE_BITMAP(regs_read, TOTAL_PER_THREAD_REGS);
DECLARE_BITMAP(predicated_regs, TOTAL_PER_THREAD_REGS);
int preg_log[PRED_WRITES_MAX];
int preg_log_idx;
DECLARE_BITMAP(pregs_written, NUM_PREGS);
+ DECLARE_BITMAP(pregs_read, NUM_PREGS);
uint8_t store_width[STORES_MAX];
bool s1_store_processed;
int future_vregs_idx;
@@ -55,8 +57,10 @@ typedef struct DisasContext {
DECLARE_BITMAP(vregs_select, NUM_VREGS);
DECLARE_BITMAP(predicated_future_vregs, NUM_VREGS);
DECLARE_BITMAP(predicated_tmp_vregs, NUM_VREGS);
+ DECLARE_BITMAP(vregs_read, NUM_VREGS);
int qreg_log[NUM_QREGS];
int qreg_log_idx;
+ DECLARE_BITMAP(qregs_read, NUM_QREGS);
bool pre_commit;
TCGCond branch_cond;
target_ulong branch_dest;
@@ -73,6 +77,11 @@ static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
}
}
+static inline void ctx_log_pred_read(DisasContext *ctx, int pnum)
+{
+ set_bit(pnum, ctx->pregs_read);
+}
+
static inline void ctx_log_reg_write(DisasContext *ctx, int rnum,
bool is_predicated)
{
@@ -99,6 +108,17 @@ static inline void ctx_log_reg_write_pair(DisasContext *ctx, int rnum,
ctx_log_reg_write(ctx, rnum + 1, is_predicated);
}
+static inline void ctx_log_reg_read(DisasContext *ctx, int rnum)
+{
+ set_bit(rnum, ctx->regs_read);
+}
+
+static inline void ctx_log_reg_read_pair(DisasContext *ctx, int rnum)
+{
+ ctx_log_reg_read(ctx, rnum);
+ ctx_log_reg_read(ctx, rnum + 1);
+}
+
intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum,
int num, bool alloc_ok);
intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum,
@@ -139,6 +159,17 @@ static inline void ctx_log_vreg_write_pair(DisasContext *ctx,
ctx_log_vreg_write(ctx, rnum ^ 1, type, is_predicated);
}
+static inline void ctx_log_vreg_read(DisasContext *ctx, int rnum)
+{
+ set_bit(rnum, ctx->vregs_read);
+}
+
+static inline void ctx_log_vreg_read_pair(DisasContext *ctx, int rnum)
+{
+ ctx_log_vreg_read(ctx, rnum ^ 0);
+ ctx_log_vreg_read(ctx, rnum ^ 1);
+}
+
static inline void ctx_log_qreg_write(DisasContext *ctx,
int rnum)
{
@@ -146,6 +177,11 @@ static inline void ctx_log_qreg_write(DisasContext *ctx,
ctx->qreg_log_idx++;
}
+static inline void ctx_log_qreg_read(DisasContext *ctx, int qnum)
+{
+ set_bit(qnum, ctx->qregs_read);
+}
+
extern TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
extern TCGv hex_pred[NUM_PREGS];
extern TCGv hex_this_PC;
diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 9874d1658f..17f86e1c32 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -102,6 +102,10 @@ DEF_ATTRIB(IMPLICIT_WRITES_P1, "Writes Predicate 1", "", "UREG.P1")
DEF_ATTRIB(IMPLICIT_WRITES_P2, "Writes Predicate 1", "", "UREG.P2")
DEF_ATTRIB(IMPLICIT_WRITES_P3, "May write Predicate 3", "", "UREG.P3")
DEF_ATTRIB(IMPLICIT_READS_PC, "Reads the PC register", "", "")
+DEF_ATTRIB(IMPLICIT_READS_P0, "Reads the P0 register", "", "")
+DEF_ATTRIB(IMPLICIT_READS_P1, "Reads the P1 register", "", "")
+DEF_ATTRIB(IMPLICIT_READS_P2, "Reads the P2 register", "", "")
+DEF_ATTRIB(IMPLICIT_READS_P3, "Reads the P3 register", "", "")
DEF_ATTRIB(IMPLICIT_WRITES_USR, "May write USR", "", "")
DEF_ATTRIB(WRITES_PRED_REG, "Writes a predicate register", "", "")
DEF_ATTRIB(COMMUTES, "The operation is communitive", "", "")
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 6b004b6248..023fc9be1e 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -336,6 +336,21 @@ static void mark_implicit_pred_writes(DisasContext *ctx)
mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P3, 3);
}
+static void mark_implicit_pred_read(DisasContext *ctx, int attrib, int pnum)
+{
+ if (GET_ATTRIB(ctx->insn->opcode, attrib)) {
+ ctx_log_pred_read(ctx, pnum);
+ }
+}
+
+static void mark_implicit_pred_reads(DisasContext *ctx)
+{
+ mark_implicit_pred_read(ctx, A_IMPLICIT_READS_P0, 0);
+ mark_implicit_pred_read(ctx, A_IMPLICIT_READS_P1, 1);
+ mark_implicit_pred_read(ctx, A_IMPLICIT_READS_P2, 2);
+ mark_implicit_pred_read(ctx, A_IMPLICIT_READS_P3, 3);
+}
+
static void analyze_packet(DisasContext *ctx)
{
Packet *pkt = ctx->pkt;
@@ -348,6 +363,7 @@ static void analyze_packet(DisasContext *ctx)
}
mark_implicit_reg_writes(ctx);
mark_implicit_pred_writes(ctx);
+ mark_implicit_pred_reads(ctx);
}
}
@@ -361,9 +377,11 @@ static void gen_start_packet(DisasContext *ctx)
ctx->next_PC = next_PC;
ctx->reg_log_idx = 0;
bitmap_zero(ctx->regs_written, TOTAL_PER_THREAD_REGS);
+ bitmap_zero(ctx->regs_read, TOTAL_PER_THREAD_REGS);
bitmap_zero(ctx->predicated_regs, TOTAL_PER_THREAD_REGS);
ctx->preg_log_idx = 0;
bitmap_zero(ctx->pregs_written, NUM_PREGS);
+ bitmap_zero(ctx->pregs_read, NUM_PREGS);
ctx->future_vregs_idx = 0;
ctx->tmp_vregs_idx = 0;
ctx->vreg_log_idx = 0;
@@ -372,6 +390,8 @@ static void gen_start_packet(DisasContext *ctx)
bitmap_zero(ctx->vregs_select, NUM_VREGS);
bitmap_zero(ctx->predicated_future_vregs, NUM_VREGS);
bitmap_zero(ctx->predicated_tmp_vregs, NUM_VREGS);
+ bitmap_zero(ctx->vregs_read, NUM_VREGS);
+ bitmap_zero(ctx->qregs_read, NUM_QREGS);
ctx->qreg_log_idx = 0;
for (i = 0; i < STORES_MAX; i++) {
ctx->store_width[i] = 0;
diff --git a/target/hexagon/gen_analyze_funcs.py b/target/hexagon/gen_analyze_funcs.py
index c74443da78..86aec5ac4b 100755
--- a/target/hexagon/gen_analyze_funcs.py
+++ b/target/hexagon/gen_analyze_funcs.py
@@ -35,12 +35,14 @@ def analyze_opn_old(f, tag, regtype, regid, regno):
predicated = "true" if is_predicated(tag) else "false"
if regtype == "R":
if regid in {"ss", "tt"}:
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_reg_read_pair(ctx, {regN});\n")
elif regid in {"dd", "ee", "xx", "yy"}:
f.write(f" const int {regN} = insn->regno[{regno}];\n")
f.write(f" ctx_log_reg_write_pair(ctx, {regN}, {predicated});\n")
elif regid in {"s", "t", "u", "v"}:
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_reg_read(ctx, {regN});\n")
elif regid in {"d", "e", "x", "y"}:
f.write(f" const int {regN} = insn->regno[{regno}];\n")
f.write(f" ctx_log_reg_write(ctx, {regN}, {predicated});\n")
@@ -48,7 +50,8 @@ def analyze_opn_old(f, tag, regtype, regid, regno):
print("Bad register parse: ", regtype, regid)
elif regtype == "P":
if regid in {"s", "t", "u", "v"}:
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_pred_read(ctx, {regN});\n")
elif regid in {"d", "e", "x"}:
f.write(f" const int {regN} = insn->regno[{regno}];\n")
f.write(f" ctx_log_pred_write(ctx, {regN});\n")
@@ -57,15 +60,19 @@ def analyze_opn_old(f, tag, regtype, regid, regno):
elif regtype == "C":
if regid == "ss":
f.write(
- f"// const int {regN} = insn->regno[{regno}] " "+ HEX_REG_SA0;\n"
+ f" const int {regN} = insn->regno[{regno}] "
+ "+ HEX_REG_SA0;\n"
)
+ f.write(f" ctx_log_reg_read_pair(ctx, {regN});\n")
elif regid == "dd":
f.write(f" const int {regN} = insn->regno[{regno}] " "+ HEX_REG_SA0;\n")
f.write(f" ctx_log_reg_write_pair(ctx, {regN}, {predicated});\n")
elif regid == "s":
f.write(
- f"// const int {regN} = insn->regno[{regno}] " "+ HEX_REG_SA0;\n"
+ f" const int {regN} = insn->regno[{regno}] "
+ "+ HEX_REG_SA0;\n"
)
+ f.write(f" ctx_log_reg_read(ctx, {regN});\n")
elif regid == "d":
f.write(f" const int {regN} = insn->regno[{regno}] " "+ HEX_REG_SA0;\n")
f.write(f" ctx_log_reg_write(ctx, {regN}, {predicated});\n")
@@ -73,7 +80,8 @@ def analyze_opn_old(f, tag, regtype, regid, regno):
print("Bad register parse: ", regtype, regid)
elif regtype == "M":
if regid == "u":
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_reg_read(ctx, {regN});\n")
else:
print("Bad register parse: ", regtype, regid)
elif regtype == "V":
@@ -88,9 +96,11 @@ def analyze_opn_old(f, tag, regtype, regid, regno):
f" ctx_log_vreg_write_pair(ctx, {regN}, {newv}, " f"{predicated});\n"
)
elif regid in {"uu", "vv"}:
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_vreg_read_pair(ctx, {regN});\n")
elif regid in {"s", "u", "v", "w"}:
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_vreg_read(ctx, {regN});\n")
elif regid in {"d", "x", "y"}:
f.write(f" const int {regN} = insn->regno[{regno}];\n")
f.write(f" ctx_log_vreg_write(ctx, {regN}, {newv}, " f"{predicated});\n")
@@ -101,7 +111,8 @@ def analyze_opn_old(f, tag, regtype, regid, regno):
f.write(f" const int {regN} = insn->regno[{regno}];\n")
f.write(f" ctx_log_qreg_write(ctx, {regN});\n")
elif regid in {"s", "t", "u", "v"}:
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_qreg_read(ctx, {regN});\n")
else:
print("Bad register parse: ", regtype, regid)
elif regtype == "G":
@@ -134,17 +145,20 @@ def analyze_opn_new(f, tag, regtype, regid, regno):
regN = f"{regtype}{regid}N"
if regtype == "N":
if regid in {"s", "t"}:
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_reg_read(ctx, {regN});\n")
else:
print("Bad register parse: ", regtype, regid)
elif regtype == "P":
if regid in {"t", "u", "v"}:
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_pred_read(ctx, {regN});\n")
else:
print("Bad register parse: ", regtype, regid)
elif regtype == "O":
if regid == "s":
- f.write(f"// const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" const int {regN} = insn->regno[{regno}];\n")
+ f.write(f" ctx_log_vreg_read(ctx, {regN});\n")
else:
print("Bad register parse: ", regtype, regid)
else:
@@ -174,8 +188,10 @@ def analyze_opn(f, tag, regtype, regid, toss, numregs, i):
## Insn *insn G_GNUC_UNUSED = ctx->insn;
## const int RdN = insn->regno[0];
## ctx_log_reg_write(ctx, RdN, false);
-## // const int RsN = insn->regno[1];
-## // const int RtN = insn->regno[2];
+## const int RsN = insn->regno[1];
+## ctx_log_reg_read(ctx, RsN);
+## const int RtN = insn->regno[2];
+## ctx_log_reg_read(ctx, RtN);
## }
##
def gen_analyze_func(f, tag, regs, imms):
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index 40f28ca933..232c6e2c20 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -97,6 +97,12 @@ def calculate_attribs():
add_qemu_macro_attrib("fSET_LPCFG", "A_IMPLICIT_WRITES_USR")
add_qemu_macro_attrib("fLOAD", "A_SCALAR_LOAD")
add_qemu_macro_attrib("fSTORE", "A_SCALAR_STORE")
+ add_qemu_macro_attrib('fLSBNEW0', 'A_IMPLICIT_READS_P0')
+ add_qemu_macro_attrib('fLSBNEW0NOT', 'A_IMPLICIT_READS_P0')
+ add_qemu_macro_attrib('fREAD_P0', 'A_IMPLICIT_READS_P0')
+ add_qemu_macro_attrib('fLSBNEW1', 'A_IMPLICIT_READS_P1')
+ add_qemu_macro_attrib('fLSBNEW1NOT', 'A_IMPLICIT_READS_P1')
+ add_qemu_macro_attrib('fREAD_P3', 'A_IMPLICIT_READS_P3')
# Recurse down macros, find attributes from sub-macros
macroValues = list(macros.values())
--
2.25.1
* [PATCH v2 11/21] Hexagon (target/hexagon) Short-circuit packet register writes
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (9 preceding siblings ...)
2023-04-27 23:00 ` [PATCH v2 10/21] Hexagon (target/hexagon) Mark registers as read during packet analysis Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
2023-04-28 2:47 ` Brian Cain
2023-04-27 23:00 ` [PATCH v2 12/21] Hexagon (target/hexagon) Short-circuit packet predicate writes Taylor Simpson
` (9 subsequent siblings)
20 siblings, 1 reply; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
In certain cases, we can avoid the overhead of writing to hex_new_value
and write directly to hex_gpr. We add a need_commit field to DisasContext
indicating whether the end-of-packet commit is needed. If it is not needed,
get_result_gpr() and get_result_gpr_pair() can return hex_gpr.
We pass ctx->need_commit to helpers when needed.
Finally, we can early-exit from gen_reg_writes during packet commit.
There are a few instructions whose semantics write to the result before
reading all the inputs. Therefore, the idef-parser generated code is
incompatible with short-circuit. We tell idef-parser to skip them.
For debugging purposes, we add a cpu property to turn off short-circuit.
When the short-circuit property is false, we skip the analysis and force
the end-of-packet commit.
Here's a simple example of the TCG generated for
0x004000b4: 0x7800c020 { R0 = #0x1 }
BEFORE:
---- 004000b4
movi_i32 new_r0,$0x1
mov_i32 r0,new_r0
AFTER:
---- 004000b4
movi_i32 r0,$0x1
This patch reintroduces a use of check_for_attrib, so we remove the
G_GNUC_UNUSED added earlier in this series.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
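Not part of the patch: a plain-C sketch of the destination selection
described in the commit message. CommitModel and the model_* functions are
hypothetical, and need_commit is taken as an input here; how the real code
computes it is not modelled. It mirrors the BEFORE/AFTER example: with
need_commit clear, the result lands in gpr directly and the commit does
nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_REGS 32   /* assumed size for this sketch */

/* Hypothetical stand-in for the register state plus the per-packet flag */
typedef struct {
    uint32_t gpr[MODEL_NUM_REGS];
    uint32_t new_value[MODEL_NUM_REGS];
    bool need_commit;
} CommitModel;

/* Pick where an instruction writes its result */
static uint32_t *model_result_gpr(CommitModel *m, int rnum)
{
    assert(rnum >= 0 && rnum < MODEL_NUM_REGS);
    return m->need_commit ? &m->new_value[rnum] : &m->gpr[rnum];
}

/* Models { Rd = #imm } */
static void model_movi(CommitModel *m, int rnum, uint32_t imm)
{
    *model_result_gpr(m, rnum) = imm;
}

/* End of packet: early exit when no commit is needed; returns copies made */
static int model_commit(CommitModel *m, const int *reg_log, int reg_log_idx)
{
    if (!m->need_commit) {
        return 0;
    }
    for (int i = 0; i < reg_log_idx; i++) {
        int rnum = reg_log[i];
        m->gpr[rnum] = m->new_value[rnum];
    }
    return reg_log_idx;
}
```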
target/hexagon/cpu.h | 1 +
target/hexagon/gen_tcg.h | 3 +-
target/hexagon/genptr.h | 2 +
target/hexagon/helper.h | 2 +-
target/hexagon/macros.h | 13 ++++-
target/hexagon/translate.h | 2 +
target/hexagon/arch.c | 3 +-
target/hexagon/cpu.c | 5 +-
target/hexagon/genptr.c | 30 ++++-------
target/hexagon/op_helper.c | 5 +-
target/hexagon/translate.c | 67 ++++++++++++++++++++++++-
target/hexagon/gen_helper_funcs.py | 2 +
target/hexagon/gen_helper_protos.py | 10 +++-
target/hexagon/gen_idef_parser_funcs.py | 7 +++
target/hexagon/gen_tcg_funcs.py | 5 ++
target/hexagon/hex_common.py | 3 ++
16 files changed, 129 insertions(+), 31 deletions(-)
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 81b663ecfb..9252055a38 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -146,6 +146,7 @@ struct ArchCPU {
bool lldb_compat;
target_ulong lldb_stack_adjust;
+ bool short_circuit;
};
#include "cpu_bits.h"
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 2b2a6175a5..1f7e535300 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -592,7 +592,8 @@
#define fGEN_TCG_A5_ACS(SHORTCODE) \
do { \
gen_helper_vacsh_pred(PeV, cpu_env, RxxV, RssV, RttV); \
- gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV); \
+ gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV, \
+ tcg_constant_tl(ctx->need_commit)); \
} while (0)
#define fGEN_TCG_S2_cabacdecbin(SHORTCODE) \
diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
index 75d0fc262d..420867f934 100644
--- a/target/hexagon/genptr.h
+++ b/target/hexagon/genptr.h
@@ -58,4 +58,6 @@ void gen_set_half(int N, TCGv result, TCGv src);
void gen_set_half_i64(int N, TCGv_i64 result, TCGv src);
void probe_noshuf_load(TCGv va, int s, int mi);
+extern const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS];
+
#endif
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index 73849e3d49..4b750d0351 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -29,7 +29,7 @@ DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
DEF_HELPER_FLAGS_1(fbrev, TCG_CALL_NO_RWG_SE, i32, i32)
DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
DEF_HELPER_2(sfinvsqrta, i64, env, f32)
-DEF_HELPER_4(vacsh_val, s64, env, s64, s64, s64)
+DEF_HELPER_5(vacsh_val, s64, env, s64, s64, s64, i32)
DEF_HELPER_FLAGS_4(vacsh_pred, TCG_CALL_NO_RWG_SE, s32, env, s64, s64, s64)
DEF_HELPER_FLAGS_2(cabacdecbin_val, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(cabacdecbin_pred, TCG_CALL_NO_RWG_SE, s32, s64, s64)
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 16e72ed0d5..a68446a367 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -44,8 +44,17 @@
reg_field_info[FIELD].offset)
#define SET_USR_FIELD(FIELD, VAL) \
- fINSERT_BITS(env->new_value[HEX_REG_USR], reg_field_info[FIELD].width, \
- reg_field_info[FIELD].offset, (VAL))
+ do { \
+ if (pkt_need_commit) { \
+ fINSERT_BITS(env->new_value[HEX_REG_USR], \
+ reg_field_info[FIELD].width, \
+ reg_field_info[FIELD].offset, (VAL)); \
+ } else { \
+ fINSERT_BITS(env->gpr[HEX_REG_USR], \
+ reg_field_info[FIELD].width, \
+ reg_field_info[FIELD].offset, (VAL)); \
+ } \
+ } while (0)
#endif
#ifdef QEMU_GENERATE
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index f72228859f..3f6fd3452c 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -62,10 +62,12 @@ typedef struct DisasContext {
int qreg_log_idx;
DECLARE_BITMAP(qregs_read, NUM_QREGS);
bool pre_commit;
+ bool need_commit;
TCGCond branch_cond;
target_ulong branch_dest;
bool is_tight_loop;
bool need_pkt_has_store_s1;
+ bool short_circuit;
} DisasContext;
static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index da79b41c4d..d053d68487 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -224,6 +224,7 @@ void arch_fpop_start(CPUHexagonState *env)
void arch_fpop_end(CPUHexagonState *env)
{
+ const bool pkt_need_commit = true;
int flags = get_float_exception_flags(&env->fp_status);
if (flags != 0) {
SOFTFLOAT_TEST_FLAG(float_flag_inexact, FPINPF, FPINPE);
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index ab40cfc283..4adf90dcfa 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -52,6 +52,8 @@ static Property hexagon_lldb_compat_property =
static Property hexagon_lldb_stack_adjust_property =
DEFINE_PROP_UNSIGNED("lldb-stack-adjust", HexagonCPU, lldb_stack_adjust,
0, qdev_prop_uint32, target_ulong);
+static Property hexagon_short_circuit_property =
+ DEFINE_PROP_BOOL("short-circuit", HexagonCPU, short_circuit, true);
const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
"r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
@@ -328,6 +330,7 @@ static void hexagon_cpu_init(Object *obj)
cpu_set_cpustate_pointers(cpu);
qdev_property_add_static(DEVICE(obj), &hexagon_lldb_compat_property);
qdev_property_add_static(DEVICE(obj), &hexagon_lldb_stack_adjust_property);
+ qdev_property_add_static(DEVICE(obj), &hexagon_short_circuit_property);
}
#include "hw/core/tcg-cpu-ops.h"
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index aff9ffe37b..5a0f6b5195 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -45,7 +45,7 @@ TCGv gen_read_preg(TCGv pred, uint8_t num)
#define IMMUTABLE (~0)
-static const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS] = {
+const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS] = {
[HEX_REG_USR] = 0xc13000c0,
[HEX_REG_PC] = IMMUTABLE,
[HEX_REG_GP] = 0x3f,
@@ -70,14 +70,18 @@ static inline void gen_masked_reg_write(TCGv new_val, TCGv cur_val,
static TCGv get_result_gpr(DisasContext *ctx, int rnum)
{
- return hex_new_value[rnum];
+ if (ctx->need_commit) {
+ return hex_new_value[rnum];
+ } else {
+ return hex_gpr[rnum];
+ }
}
static TCGv_i64 get_result_gpr_pair(DisasContext *ctx, int rnum)
{
TCGv_i64 result = tcg_temp_new_i64();
- tcg_gen_concat_i32_i64(result, hex_new_value[rnum],
- hex_new_value[rnum + 1]);
+ tcg_gen_concat_i32_i64(result, get_result_gpr(ctx, rnum),
+ get_result_gpr(ctx, rnum + 1));
return result;
}
@@ -86,7 +90,7 @@ void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
const target_ulong reg_mask = reg_immut_masks[rnum];
gen_masked_reg_write(val, hex_gpr[rnum], reg_mask);
- tcg_gen_mov_tl(hex_new_value[rnum], val);
+ tcg_gen_mov_tl(get_result_gpr(ctx, rnum), val);
if (HEX_DEBUG) {
/* Do this so HELPER(debug_commit_end) will know */
tcg_gen_movi_tl(hex_reg_written[rnum], 1);
@@ -95,27 +99,15 @@ void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
static void gen_log_reg_write_pair(DisasContext *ctx, int rnum, TCGv_i64 val)
{
- const target_ulong reg_mask_low = reg_immut_masks[rnum];
- const target_ulong reg_mask_high = reg_immut_masks[rnum + 1];
TCGv val32 = tcg_temp_new();
/* Low word */
tcg_gen_extrl_i64_i32(val32, val);
- gen_masked_reg_write(val32, hex_gpr[rnum], reg_mask_low);
- tcg_gen_mov_tl(hex_new_value[rnum], val32);
- if (HEX_DEBUG) {
- /* Do this so HELPER(debug_commit_end) will know */
- tcg_gen_movi_tl(hex_reg_written[rnum], 1);
- }
+ gen_log_reg_write(ctx, rnum, val32);
/* High word */
tcg_gen_extrh_i64_i32(val32, val);
- gen_masked_reg_write(val32, hex_gpr[rnum + 1], reg_mask_high);
- tcg_gen_mov_tl(hex_new_value[rnum + 1], val32);
- if (HEX_DEBUG) {
- /* Do this so HELPER(debug_commit_end) will know */
- tcg_gen_movi_tl(hex_reg_written[rnum + 1], 1);
- }
+ gen_log_reg_write(ctx, rnum + 1, val32);
}
void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 46ccc59106..fc5c30a141 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -220,7 +220,7 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
reg_printed = true;
}
HEX_DEBUG_LOG("\tr%d = " TARGET_FMT_ld " (0x" TARGET_FMT_lx ")\n",
- i, env->new_value[i], env->new_value[i]);
+ i, env->gpr[i], env->gpr[i]);
}
}
@@ -352,7 +352,8 @@ uint64_t HELPER(sfinvsqrta)(CPUHexagonState *env, float32 RsV)
}
int64_t HELPER(vacsh_val)(CPUHexagonState *env,
- int64_t RxxV, int64_t RssV, int64_t RttV)
+ int64_t RxxV, int64_t RssV, int64_t RttV,
+ uint32_t pkt_need_commit)
{
for (int i = 0; i < 4; i++) {
int xv = sextract64(RxxV, i * 16, 16);
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 023fc9be1e..5bd71bdcaf 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -27,6 +27,7 @@
#include "insn.h"
#include "decode.h"
#include "translate.h"
+#include "genptr.h"
#include "printinsn.h"
#include "analyze_funcs_generated.c.inc"
@@ -239,7 +240,7 @@ static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
return nwords;
}
-static G_GNUC_UNUSED bool check_for_attrib(Packet *pkt, int attrib)
+static bool check_for_attrib(Packet *pkt, int attrib)
{
for (int i = 0; i < pkt->num_insns; i++) {
if (GET_ATTRIB(pkt->insn[i].opcode, attrib)) {
@@ -336,6 +337,58 @@ static void mark_implicit_pred_writes(DisasContext *ctx)
mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P3, 3);
}
+static bool pkt_raises_exception(Packet *pkt)
+{
+ if (check_for_attrib(pkt, A_LOAD) ||
+ check_for_attrib(pkt, A_STORE)) {
+ return true;
+ }
+ return false;
+}
+
+static bool need_commit(DisasContext *ctx)
+{
+ Packet *pkt = ctx->pkt;
+
+ /*
+ * If the short-circuit property is set to false, we'll always do the commit
+ */
+ if (!ctx->short_circuit) {
+ return true;
+ }
+
+ if (pkt_raises_exception(pkt)) {
+ return true;
+ }
+
+ /* Registers with immutability flags require new_value */
+ for (int i = 0; i < ctx->reg_log_idx; i++) {
+ int rnum = ctx->reg_log[i];
+ if (reg_immut_masks[rnum]) {
+ return true;
+ }
+ }
+
+ /* Floating point instructions are hard-coded to use new_value */
+ if (check_for_attrib(pkt, A_FPOP)) {
+ return true;
+ }
+
+ if (pkt->num_insns == 1) {
+ return false;
+ }
+
+ /* Check for overlap between register reads and writes */
+ for (int i = 0; i < ctx->reg_log_idx; i++) {
+ int rnum = ctx->reg_log[i];
+ if (test_bit(rnum, ctx->regs_read)) {
+ return true;
+ }
+ }
+
+ return false;
+}
+
static void mark_implicit_pred_read(DisasContext *ctx, int attrib, int pnum)
{
if (GET_ATTRIB(ctx->insn->opcode, attrib)) {
@@ -365,6 +418,8 @@ static void analyze_packet(DisasContext *ctx)
mark_implicit_pred_writes(ctx);
mark_implicit_pred_reads(ctx);
}
+
+ ctx->need_commit = need_commit(ctx);
}
static void gen_start_packet(DisasContext *ctx)
@@ -434,7 +489,8 @@ static void gen_start_packet(DisasContext *ctx)
}
/* Preload the predicated registers into hex_new_value[i] */
- if (!bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
+ if (ctx->need_commit &&
+ !bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
int i = find_first_bit(ctx->predicated_regs, TOTAL_PER_THREAD_REGS);
while (i < TOTAL_PER_THREAD_REGS) {
tcg_gen_mov_tl(hex_new_value[i], hex_gpr[i]);
@@ -541,6 +597,11 @@ static void gen_reg_writes(DisasContext *ctx)
{
int i;
+ /* Early exit if not needed */
+ if (!ctx->need_commit) {
+ return;
+ }
+
for (i = 0; i < ctx->reg_log_idx; i++) {
int reg_num = ctx->reg_log[i];
@@ -919,6 +980,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
CPUState *cs)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
+ HexagonCPU *hex_cpu = env_archcpu(cs->env_ptr);
uint32_t hex_flags = dcbase->tb->flags;
ctx->mem_idx = MMU_USER_IDX;
@@ -927,6 +989,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
ctx->num_hvx_insns = 0;
ctx->branch_cond = TCG_COND_NEVER;
ctx->is_tight_loop = FIELD_EX32(hex_flags, TB_FLAGS, IS_TIGHT_LOOP);
+ ctx->short_circuit = hex_cpu->short_circuit;
}
static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
index c73d792580..e259ea3d03 100755
--- a/target/hexagon/gen_helper_funcs.py
+++ b/target/hexagon/gen_helper_funcs.py
@@ -287,6 +287,8 @@ def gen_helper_function(f, tag, tagregs, tagimms):
if hex_common.need_pkt_has_multi_cof(tag):
f.write(", uint32_t pkt_has_multi_cof")
+ if (hex_common.need_pkt_need_commit(tag)):
+ f.write(", uint32_t pkt_need_commit")
if hex_common.need_PC(tag):
if i > 0:
diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helper_protos.py
index 187cd6e04e..c5ecb85294 100755
--- a/target/hexagon/gen_helper_protos.py
+++ b/target/hexagon/gen_helper_protos.py
@@ -86,6 +86,8 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
def_helper_size = len(regs) + len(imms) + numscalarreadwrite + 1
if hex_common.need_pkt_has_multi_cof(tag):
def_helper_size += 1
+ if hex_common.need_pkt_need_commit(tag):
+ def_helper_size += 1
if hex_common.need_part1(tag):
def_helper_size += 1
if hex_common.need_slot(tag):
@@ -103,6 +105,8 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
def_helper_size = len(regs) + len(imms) + numscalarreadwrite
if hex_common.need_pkt_has_multi_cof(tag):
def_helper_size += 1
+ if hex_common.need_pkt_need_commit(tag):
+ def_helper_size += 1
if hex_common.need_part1(tag):
def_helper_size += 1
if hex_common.need_slot(tag):
@@ -156,10 +160,12 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
for immlett, bits, immshift in imms:
f.write(", s32")
- ## Add the arguments for the instruction pkt_has_multi_cof, slot and
- ## part1 (if needed)
+ ## Add the arguments for the instruction pkt_has_multi_cof,
+ ## pkt_need_commit, PC, next_PC, slot, and part1 (if needed)
if hex_common.need_pkt_has_multi_cof(tag):
f.write(", i32")
+ if hex_common.need_pkt_need_commit(tag):
+ f.write(', i32')
if hex_common.need_PC(tag):
f.write(", i32")
if hex_common.helper_needs_next_PC(tag):
diff --git a/target/hexagon/gen_idef_parser_funcs.py b/target/hexagon/gen_idef_parser_funcs.py
index afe68bdb6f..b7f2df0f36 100644
--- a/target/hexagon/gen_idef_parser_funcs.py
+++ b/target/hexagon/gen_idef_parser_funcs.py
@@ -109,6 +109,13 @@ def main():
continue
if "A_COF" in hex_common.attribdict[tag]:
continue
+ ## Skip instructions that are incompatible with short-circuit
+ ## packet register writes
+ if ( tag == 'S2_insert' or
+ tag == 'S2_insert_rp' or
+ tag == 'S2_asr_r_svw_trun' or
+ tag == 'A2_swiz' ):
+ continue
regs = tagregs[tag]
imms = tagimms[tag]
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index d9ccbe63f6..0e45d43685 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -550,6 +550,9 @@ def gen_tcg_func(f, tag, regs, imms):
if hex_common.need_pkt_has_multi_cof(tag):
f.write(" TCGv pkt_has_multi_cof = ")
f.write("tcg_constant_tl(ctx->pkt->pkt_has_multi_cof);\n")
+ if hex_common.need_pkt_need_commit(tag):
+ f.write(" TCGv pkt_need_commit = ")
+ f.write("tcg_constant_tl(ctx->need_commit);\n")
if hex_common.need_part1(tag):
f.write(" TCGv part1 = tcg_constant_tl(insn->part1);\n")
if hex_common.need_slot(tag):
@@ -596,6 +599,8 @@ def gen_tcg_func(f, tag, regs, imms):
if hex_common.need_pkt_has_multi_cof(tag):
f.write(", pkt_has_multi_cof")
+ if hex_common.need_pkt_need_commit(tag):
+ f.write(", pkt_need_commit")
if hex_common.need_PC(tag):
f.write(", PC")
if hex_common.helper_needs_next_PC(tag):
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index 232c6e2c20..29c0508f66 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -276,6 +276,9 @@ def need_pkt_has_multi_cof(tag):
return "A_COF" in attribdict[tag]
+def need_pkt_need_commit(tag):
+ return 'A_IMPLICIT_WRITES_USR' in attribdict[tag]
+
def need_condexec_reg(tag, regs):
if "A_CONDEXEC" in attribdict[tag]:
for regtype, regid, toss, numregs in regs:
--
2.25.1
* [PATCH v2 12/21] Hexagon (target/hexagon) Short-circuit packet predicate writes
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
In certain cases, we can avoid the overhead of writing to hex_new_pred_value
and write directly to hex_pred. We consider predicate reads/writes when
computing ctx->need_commit. The get_result_pred() function uses this
field to decide between hex_new_pred_value and hex_pred. Then, we can
early-exit from gen_pred_writes.
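
As a rough illustration of the decision described above (not the QEMU
code itself), the sketch below models the predicate overlap check and
the choice of write target. PredModel, pred_need_commit and pred_result
are hypothetical names made up for this example; only the idea of
"written and also read means commit" is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_PREGS 4

/* Hypothetical stand-in for the predicate bookkeeping in DisasContext */
typedef struct {
    int preg_log[MODEL_NUM_PREGS]; /* predicates written by the packet */
    int preg_log_idx;              /* number of valid entries in preg_log */
    uint8_t pregs_read;            /* bit n set => Pn is read by the packet */
} PredModel;

/* A predicate that is both written and read forces the staged commit */
static bool pred_need_commit(const PredModel *m)
{
    for (int i = 0; i < m->preg_log_idx; i++) {
        if (m->pregs_read & (1u << m->preg_log[i])) {
            return true;
        }
    }
    return false;
}

/* Choose the staging copy or the live register, as get_result_pred() does */
static uint8_t *pred_result(bool need_commit, uint8_t *pred,
                            uint8_t *new_pred, int pnum)
{
    assert(pnum >= 0 && pnum < MODEL_NUM_PREGS);
    return need_commit ? &new_pred[pnum] : &pred[pnum];
}
```

In this model, a packet that writes P0 and reads only P1 can write P0 in
place, while a packet that writes and reads P1 has to go through the
staging copy.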
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/genptr.h | 1 +
target/hexagon/genptr.c | 15 ++++++++++++---
target/hexagon/translate.c | 14 +++++++++++---
3 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
index 420867f934..e11ccc2358 100644
--- a/target/hexagon/genptr.h
+++ b/target/hexagon/genptr.h
@@ -35,6 +35,7 @@ void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src, uint32_t slot);
void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src, uint32_t slot);
TCGv gen_read_reg(TCGv result, int num);
TCGv gen_read_preg(TCGv pred, uint8_t num);
+TCGv get_result_pred(DisasContext *ctx, int pnum);
void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val);
void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val);
void gen_set_usr_field(DisasContext *ctx, int field, TCGv val);
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 5a0f6b5195..33f9d78aed 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -110,8 +110,18 @@ static void gen_log_reg_write_pair(DisasContext *ctx, int rnum, TCGv_i64 val)
gen_log_reg_write(ctx, rnum + 1, val32);
}
+TCGv get_result_pred(DisasContext *ctx, int pnum)
+{
+ if (ctx->need_commit) {
+ return hex_new_pred_value[pnum];
+ } else {
+ return hex_pred[pnum];
+ }
+}
+
void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
{
+ TCGv pred = get_result_pred(ctx, pnum);
TCGv base_val = tcg_temp_new();
tcg_gen_andi_tl(base_val, val, 0xff);
@@ -124,10 +134,9 @@ void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
* straight assignment. Otherwise, do an and.
*/
if (!test_bit(pnum, ctx->pregs_written)) {
- tcg_gen_mov_tl(hex_new_pred_value[pnum], base_val);
+ tcg_gen_mov_tl(pred, base_val);
} else {
- tcg_gen_and_tl(hex_new_pred_value[pnum],
- hex_new_pred_value[pnum], base_val);
+ tcg_gen_and_tl(pred, pred, base_val);
}
if (HEX_DEBUG) {
tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pnum);
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 5bd71bdcaf..4532b8d05e 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -386,6 +386,14 @@ static bool need_commit(DisasContext *ctx)
}
}
+ /* Check for overlap between predicate reads and writes */
+ for (int i = 0; i < ctx->preg_log_idx; i++) {
+ int pnum = ctx->preg_log[i];
+ if (test_bit(pnum, ctx->pregs_read)) {
+ return true;
+ }
+ }
+
return false;
}
@@ -503,7 +511,7 @@ static void gen_start_packet(DisasContext *ctx)
* Preload the predicated pred registers into hex_new_pred_value[pred_num]
* Only endloop instructions conditionally write to pred registers
*/
- if (pkt->pkt_has_endloop) {
+ if (ctx->need_commit && pkt->pkt_has_endloop) {
for (int i = 0; i < ctx->preg_log_idx; i++) {
int pred_num = ctx->preg_log[i];
tcg_gen_mov_tl(hex_new_pred_value[pred_num], hex_pred[pred_num]);
@@ -619,8 +627,8 @@ static void gen_reg_writes(DisasContext *ctx)
static void gen_pred_writes(DisasContext *ctx)
{
- /* Early exit if the log is empty */
- if (!ctx->preg_log_idx) {
+ /* Early exit if not needed or the log is empty */
+ if (!ctx->need_commit || !ctx->preg_log_idx) {
return;
}
--
2.25.1
* [PATCH v2 13/21] Hexagon (target/hexagon) Short-circuit packet HVX writes
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
In certain cases, we can avoid the overhead of writing to future_VRegs
and write directly to VRegs. We consider HVX reads/writes when computing
ctx->need_commit. Then, we can early-exit from gen_commit_hvx.
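
The HVX vector part of the check boils down to a bitmap intersection.
The sketch below is a simplified model, assuming the 32 vector registers
fit in one 32-bit mask; vreg_bit and hvx_need_commit are hypothetical
names, and the other conditions in need_commit() (loads/stores, FP ops,
HVX predicate registers, and so on) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_VREGS 32

/* Bit mask for vector register vnum (hypothetical helper) */
static uint32_t vreg_bit(unsigned vnum)
{
    assert(vnum < MODEL_NUM_VREGS);
    return UINT32_C(1) << vnum;
}

/*
 * Model of the HVX overlap check: any register that is written
 * (normal, .tmp, or select form) and also read by the packet means
 * the writes must be staged and committed at the end of the packet.
 */
static bool hvx_need_commit(uint32_t vregs_written,
                            uint32_t vregs_updated_tmp,
                            uint32_t vregs_select,
                            uint32_t vregs_read)
{
    uint32_t all_writes = vregs_written | vregs_updated_tmp | vregs_select;
    return (all_writes & vregs_read) != 0;
}
```

The loops over find_first_bit/find_next_bit in the patch compute the
same intersection one bit at a time on bitmaps of arbitrary width.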
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/genptr.c | 6 ++++-
target/hexagon/translate.c | 46 +++++++++++++++++++++++++++++++++++++-
2 files changed, 50 insertions(+), 2 deletions(-)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 33f9d78aed..d134d8082a 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -1104,7 +1104,11 @@ static void gen_log_vreg_write_pair(DisasContext *ctx, intptr_t srcoff, int num,
static intptr_t get_result_qreg(DisasContext *ctx, int qnum)
{
- return offsetof(CPUHexagonState, future_QRegs[qnum]);
+ if (ctx->need_commit) {
+ return offsetof(CPUHexagonState, future_QRegs[qnum]);
+ } else {
+ return offsetof(CPUHexagonState, QRegs[qnum]);
+ }
}
static void gen_vreg_load(DisasContext *ctx, intptr_t dstoff, TCGv src,
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 4532b8d05e..b714a8da96 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -70,6 +70,10 @@ intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum,
{
intptr_t offset;
+ if (!ctx->need_commit) {
+ return offsetof(CPUHexagonState, VRegs[regnum]);
+ }
+
/* See if it is already allocated */
for (int i = 0; i < ctx->future_vregs_idx; i++) {
if (ctx->future_vregs_num[i] == regnum) {
@@ -374,7 +378,7 @@ static bool need_commit(DisasContext *ctx)
return true;
}
- if (pkt->num_insns == 1) {
+ if (pkt->num_insns == 1 && !pkt->pkt_has_hvx) {
return false;
}
@@ -394,6 +398,40 @@ static bool need_commit(DisasContext *ctx)
}
}
+ /* Check for overlap between HVX reads and writes */
+ for (int i = 0; i < ctx->vreg_log_idx; i++) {
+ int vnum = ctx->vreg_log[i];
+ if (test_bit(vnum, ctx->vregs_read)) {
+ return true;
+ }
+ }
+ if (!bitmap_empty(ctx->vregs_updated_tmp, NUM_VREGS)) {
+ int i = find_first_bit(ctx->vregs_updated_tmp, NUM_VREGS);
+ while (i < NUM_VREGS) {
+ if (test_bit(i, ctx->vregs_read)) {
+ return true;
+ }
+ i = find_next_bit(ctx->vregs_updated_tmp, NUM_VREGS, i + 1);
+ }
+ }
+ if (!bitmap_empty(ctx->vregs_select, NUM_VREGS)) {
+ int i = find_first_bit(ctx->vregs_select, NUM_VREGS);
+ while (i < NUM_VREGS) {
+ if (test_bit(i, ctx->vregs_read)) {
+ return true;
+ }
+ i = find_next_bit(ctx->vregs_select, NUM_VREGS, i + 1);
+ }
+ }
+
+ /* Check for overlap between HVX predicate reads and writes */
+ for (int i = 0; i < ctx->qreg_log_idx; i++) {
+ int qnum = ctx->qreg_log[i];
+ if (test_bit(qnum, ctx->qregs_read)) {
+ return true;
+ }
+ }
+
return false;
}
@@ -787,6 +825,12 @@ static void gen_commit_hvx(DisasContext *ctx)
{
int i;
+ /* Early exit if not needed */
+ if (!ctx->need_commit) {
+ g_assert(!pkt_has_hvx_store(ctx->pkt));
+ return;
+ }
+
/*
* for (i = 0; i < ctx->vreg_log_idx; i++) {
* int rnum = ctx->vreg_log[i];
--
2.25.1
* [PATCH v2 14/21] Hexagon (target/hexagon) Short-circuit more HVX single instruction packets
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
The generated helpers for HVX use pass-by-reference, so they can't
short-circuit when the reads/writes overlap. The instructions with
overrides are OK because they use tcg_gen_gvec_*.
We add a flag has_hvx_helper to DisasContext and extend gen_analyze_funcs
to set the flag when the instruction is an HVX instruction with a
generated helper.
We add an override for V6_vcombine so that it can be short-circuited,
along with a test case in tests/tcg/hexagon/hvx_misc.c.
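
To illustrate the aliasing hazard, here is a small standalone model of
the combine operation. Vec, combine_naive and combine_safe are
hypothetical names for this sketch; the real override works on
CPUHexagonState offsets with tcg_gen_gvec_mov rather than C pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Tiny stand-in for an HVX vector */
typedef struct {
    int32_t w[4];
} Vec;

/* Writes lo before reading u, so it breaks when u aliases lo */
static void combine_naive(Vec *lo, Vec *hi, const Vec *u, const Vec *v)
{
    assert(lo != hi);
    *lo = *v;
    *hi = *u;
}

/* Same operation, but saves u first when it aliases the low half */
static void combine_safe(Vec *lo, Vec *hi, const Vec *u, const Vec *v)
{
    assert(lo != hi);
    if (lo != u) {
        *lo = *v;
        *hi = *u;
    } else {
        Vec tmp = *u;
        *lo = *v;
        *hi = tmp;
    }
}
```

Calling either function with lo and u pointing at the same vector
mirrors "v3:2 = vcombine(v2, v3)" from the test case: the naive version
ends up with both halves holding the old v3, while the safe version
swaps the two values as expected.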
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
target/hexagon/gen_tcg_hvx.h | 23 +++++++++++++++++++++++
target/hexagon/translate.h | 1 +
target/hexagon/translate.c | 17 +++++++++++++++--
tests/tcg/hexagon/hvx_misc.c | 21 +++++++++++++++++++++
target/hexagon/gen_analyze_funcs.py | 5 +++++
5 files changed, 65 insertions(+), 2 deletions(-)
diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h
index d4aefe8e3f..19680d8505 100644
--- a/target/hexagon/gen_tcg_hvx.h
+++ b/target/hexagon/gen_tcg_hvx.h
@@ -128,6 +128,29 @@ static inline void assert_vhist_tmp(DisasContext *ctx)
tcg_gen_gvec_mov(MO_64, VdV_off, VuV_off, \
sizeof(MMVector), sizeof(MMVector))
+/*
+ * Vector combine
+ *
+ * Be careful that the source and dest don't overlap
+ */
+#define fGEN_TCG_V6_vcombine(SHORTCODE) \
+ do { \
+ if (VddV_off != VuV_off) { \
+ tcg_gen_gvec_mov(MO_64, VddV_off, VvV_off, \
+ sizeof(MMVector), sizeof(MMVector)); \
+ tcg_gen_gvec_mov(MO_64, VddV_off + sizeof(MMVector), VuV_off, \
+ sizeof(MMVector), sizeof(MMVector)); \
+ } else { \
+ intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \
+ tcg_gen_gvec_mov(MO_64, tmpoff, VuV_off, \
+ sizeof(MMVector), sizeof(MMVector)); \
+ tcg_gen_gvec_mov(MO_64, VddV_off, VvV_off, \
+ sizeof(MMVector), sizeof(MMVector)); \
+ tcg_gen_gvec_mov(MO_64, VddV_off + sizeof(MMVector), tmpoff, \
+ sizeof(MMVector), sizeof(MMVector)); \
+ } \
+ } while (0)
+
/* Vector conditional move */
#define fGEN_TCG_VEC_CMOV(PRED) \
do { \
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 3f6fd3452c..26bcae0395 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -68,6 +68,7 @@ typedef struct DisasContext {
bool is_tight_loop;
bool need_pkt_has_store_s1;
bool short_circuit;
+ bool has_hvx_helper;
} DisasContext;
static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index b714a8da96..c7a04e34d2 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -378,8 +378,20 @@ static bool need_commit(DisasContext *ctx)
return true;
}
- if (pkt->num_insns == 1 && !pkt->pkt_has_hvx) {
- return false;
+ if (pkt->num_insns == 1) {
+ if (pkt->pkt_has_hvx) {
+ /*
+ * The HVX instructions with generated helpers use
+ * pass-by-reference, so they need the read/write overlap
+ * check below.
+ * The HVX instructions with overrides are OK.
+ */
+ if (!ctx->has_hvx_helper) {
+ return false;
+ }
+ } else {
+ return false;
+ }
}
/* Check for overlap between register reads and writes */
@@ -454,6 +466,7 @@ static void analyze_packet(DisasContext *ctx)
{
Packet *pkt = ctx->pkt;
ctx->need_pkt_has_store_s1 = false;
+ ctx->has_hvx_helper = false;
for (int i = 0; i < pkt->num_insns; i++) {
Insn *insn = &pkt->insn[i];
ctx->insn = insn;
diff --git a/tests/tcg/hexagon/hvx_misc.c b/tests/tcg/hexagon/hvx_misc.c
index d0e64e035f..c89fe0253d 100644
--- a/tests/tcg/hexagon/hvx_misc.c
+++ b/tests/tcg/hexagon/hvx_misc.c
@@ -454,6 +454,25 @@ static void test_load_cur_predicated(void)
check_output_w(__LINE__, BUFSIZE);
}
+static void test_vcombine(void)
+{
+ for (int i = 0; i < BUFSIZE / 2; i++) {
+ asm volatile("v2 = vsplat(%0)\n\t"
+ "v3 = vsplat(%1)\n\t"
+ "v3:2 = vcombine(v2, v3)\n\t"
+ "vmem(%2+#0) = v2\n\t"
+ "vmem(%2+#1) = v3\n\t"
+ :
+ : "r"(2 * i), "r"(2 * i + 1), "r"(&output[2 * i])
+ : "v2", "v3", "memory");
+ for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+ expect[2 * i].w[j] = 2 * i + 1;
+ expect[2 * i + 1].w[j] = 2 * i;
+ }
+ }
+ check_output_w(__LINE__, BUFSIZE);
+}
+
int main()
{
init_buffers();
@@ -494,6 +513,8 @@ int main()
test_load_tmp_predicated();
test_load_cur_predicated();
+ test_vcombine();
+
puts(err ? "FAIL" : "PASS");
return err ? 1 : 0;
}
diff --git a/target/hexagon/gen_analyze_funcs.py b/target/hexagon/gen_analyze_funcs.py
index 86aec5ac4b..36da669450 100755
--- a/target/hexagon/gen_analyze_funcs.py
+++ b/target/hexagon/gen_analyze_funcs.py
@@ -212,6 +212,11 @@ def gen_analyze_func(f, tag, regs, imms):
if has_generated_helper and "A_SCALAR_LOAD" in hex_common.attribdict[tag]:
f.write(" ctx->need_pkt_has_store_s1 = true;\n")
+ ## Mark HVX instructions with generated helpers
+ if (has_generated_helper and
+ "A_CVI" in hex_common.attribdict[tag]):
+ f.write(" ctx->has_hvx_helper = true;\n")
+
f.write("}\n\n")
--
2.25.1
* [PATCH v2 15/21] Hexagon (target/hexagon) Add overrides for disabled idef-parser insns
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
The following have overrides
S2_insert
S2_insert_rp
S2_asr_r_svw_trun
A2_swiz
These instructions have semantics that write to the destination
before all the operand reads have been completed. Therefore,
the idef-parser versions were disabled with the short-circuit patch.
Test cases are added in tests/tcg/hexagon/read_write_overlap.c
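
The problem can be reproduced outside of TCG with a byte swap written
one byte at a time. The sketch below is only a model, assuming byte-wise
semantics similar in spirit to A2_swiz; swiz_bytewise and swiz_whole are
hypothetical names and not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte-by-byte swap: each step writes one byte of *rd and then
 * re-reads *rs.  Correct only when rd and rs are distinct.
 */
static void swiz_bytewise(uint32_t *rd, const uint32_t *rs)
{
    assert(rd && rs);
    for (int i = 0; i < 4; i++) {
        uint32_t byte = (*rs >> (8 * (3 - i))) & 0xff;
        *rd = (*rd & ~(0xffu << (8 * i))) | (byte << (8 * i));
    }
}

/* Reads the whole source first, so aliasing rd and rs is harmless */
static void swiz_whole(uint32_t *rd, const uint32_t *rs)
{
    assert(rd && rs);
    uint32_t s = *rs;
    *rd = (s >> 24) | ((s >> 8) & 0xff00) |
          ((s << 8) & 0xff0000) | (s << 24);
}
```

With separate source and destination both versions agree. When the same
variable is passed for both, which is what short-circuiting does when
the destination and source are the same TCGv, only the second version
still produces the swapped value.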
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/gen_tcg.h | 18 ++++
target/hexagon/genptr.c | 99 ++++++++++++++++++
tests/tcg/hexagon/read_write_overlap.c | 136 +++++++++++++++++++++++++
tests/tcg/hexagon/Makefile.target | 1 +
4 files changed, 254 insertions(+)
create mode 100644 tests/tcg/hexagon/read_write_overlap.c
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 1f7e535300..fabc1eb623 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -1181,6 +1181,24 @@
tcg_gen_extrl_i64_i32(RdV, tmp); \
} while (0)
+#define fGEN_TCG_S2_insert(SHORTCODE) \
+ do { \
+ int width = uiV; \
+ int offset = UiV; \
+ if (width != 0) { \
+ if (offset + width > 32) { \
+ width = 32 - offset; \
+ } \
+ tcg_gen_deposit_tl(RxV, RxV, RsV, offset, width); \
+ } \
+ } while (0)
+#define fGEN_TCG_S2_insert_rp(SHORTCODE) \
+ gen_insert_rp(ctx, RxV, RsV, RttV)
+#define fGEN_TCG_S2_asr_r_svw_trun(SHORTCODE) \
+ gen_asr_r_svw_trun(ctx, RdV, RssV, RtV)
+#define fGEN_TCG_A2_swiz(SHORTCODE) \
+ tcg_gen_bswap_tl(RdV, RsV)
+
/* Floating point */
#define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \
gen_helper_conv_sf2df(RddV, cpu_env, RsV)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index d134d8082a..0727d4524b 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -1065,6 +1065,105 @@ static void gen_asl_r_r_sat(DisasContext *ctx, TCGv RdV, TCGv RsV, TCGv RtV)
gen_set_label(done);
}
+static void gen_insert_rp(DisasContext *ctx, TCGv RxV, TCGv RsV, TCGv_i64 RttV)
+{
+ /*
+ * int width = fZXTN(6, 32, (fGETWORD(1, RttV)));
+ * int offset = fSXTN(7, 32, (fGETWORD(0, RttV)));
+ * size8u_t mask = ((fCONSTLL(1) << width) - 1);
+ * if (offset < 0) {
+ * RxV = 0;
+ * } else {
+ * RxV &= ~(mask << offset);
+ * RxV |= ((RsV & mask) << offset);
+ * }
+ */
+
+ TCGv width = tcg_temp_new();
+ TCGv offset = tcg_temp_new();
+ TCGv_i64 mask = tcg_temp_new_i64();
+ TCGv_i64 result = tcg_temp_new_i64();
+ TCGv_i64 tmp = tcg_temp_new_i64();
+ TCGv_i64 offset64 = tcg_temp_new_i64();
+ TCGLabel *label = gen_new_label();
+ TCGLabel *done = gen_new_label();
+
+ tcg_gen_extrh_i64_i32(width, RttV);
+ tcg_gen_extract_tl(width, width, 0, 6);
+ tcg_gen_extrl_i64_i32(offset, RttV);
+ tcg_gen_sextract_tl(offset, offset, 0, 7);
+ /* Possible values for offset are -64 .. 63 */
+ tcg_gen_brcondi_tl(TCG_COND_GE, offset, 0, label);
+ /* For negative offsets, zero out the result */
+ tcg_gen_movi_tl(RxV, 0);
+ tcg_gen_br(done);
+ gen_set_label(label);
+ /* At this point, possible values of offset are 0 .. 63 */
+ tcg_gen_ext_i32_i64(mask, width);
+ tcg_gen_shl_i64(mask, tcg_constant_i64(1), mask);
+ tcg_gen_subi_i64(mask, mask, 1);
+ tcg_gen_extu_i32_i64(result, RxV);
+ tcg_gen_ext_i32_i64(tmp, offset);
+ tcg_gen_shl_i64(tmp, mask, tmp);
+ tcg_gen_andc_i64(result, result, tmp);
+ tcg_gen_extu_i32_i64(tmp, RsV);
+ tcg_gen_and_i64(tmp, tmp, mask);
+ tcg_gen_extu_i32_i64(offset64, offset);
+ tcg_gen_shl_i64(tmp, tmp, offset64);
+ tcg_gen_or_i64(result, result, tmp);
+ tcg_gen_extrl_i64_i32(RxV, result);
+ gen_set_label(done);
+}
+
+static void gen_asr_r_svw_trun(DisasContext *ctx, TCGv RdV,
+ TCGv_i64 RssV, TCGv RtV)
+{
+ /*
+ * for (int i = 0; i < 2; i++) {
+ * fSETHALF(i, RdV, fGETHALF(0, ((fSXTN(7, 32, RtV) > 0) ?
+ * (fCAST4_8s(fGETWORD(i, RssV)) >> fSXTN(7, 32, RtV)) :
+ * (fCAST4_8s(fGETWORD(i, RssV)) << -fSXTN(7, 32, RtV)))));
+ * }
+ */
+ TCGv shift_amt32 = tcg_temp_new();
+ TCGv_i64 shift_amt64 = tcg_temp_new_i64();
+ TCGv_i64 tmp64 = tcg_temp_new_i64();
+ TCGv tmp32 = tcg_temp_new();
+ TCGLabel *label = gen_new_label();
+ TCGLabel *zero = gen_new_label();
+ TCGLabel *done = gen_new_label();
+
+ tcg_gen_sextract_tl(shift_amt32, RtV, 0, 7);
+ /* Possible values of shift_amt32 are -64 .. 63 */
+ tcg_gen_brcondi_tl(TCG_COND_LE, shift_amt32, 0, label);
+ /* After branch, possible values of shift_amt32 are 1 .. 63 */
+ tcg_gen_ext_i32_i64(shift_amt64, shift_amt32);
+ for (int i = 0; i < 2; i++) {
+ tcg_gen_sextract_i64(tmp64, RssV, i * 32, 32);
+ tcg_gen_sar_i64(tmp64, tmp64, shift_amt64);
+ tcg_gen_extrl_i64_i32(tmp32, tmp64);
+ tcg_gen_deposit_tl(RdV, RdV, tmp32, i * 16, 16);
+ }
+ tcg_gen_br(done);
+ gen_set_label(label);
+ tcg_gen_neg_tl(shift_amt32, shift_amt32);
+ /* At this point, possible values of shift_amt32 are 0 .. 64 */
+ tcg_gen_brcondi_tl(TCG_COND_GT, shift_amt32, 63, zero);
+ /* At this point, possible values of shift_amt32 are 0 .. 63 */
+ tcg_gen_ext_i32_i64(shift_amt64, shift_amt32);
+ for (int i = 0; i < 2; i++) {
+ tcg_gen_sextract_i64(tmp64, RssV, i * 32, 32);
+ tcg_gen_shl_i64(tmp64, tmp64, shift_amt64);
+ tcg_gen_extrl_i64_i32(tmp32, tmp64);
+ tcg_gen_deposit_tl(RdV, RdV, tmp32, i * 16, 16);
+ }
+ tcg_gen_br(done);
+ gen_set_label(zero);
+ /* When the shift_amt is 64, zero out the result */
+ tcg_gen_movi_tl(RdV, 0);
+ gen_set_label(done);
+}
+
static intptr_t vreg_src_off(DisasContext *ctx, int num)
{
intptr_t offset = offsetof(CPUHexagonState, VRegs[num]);
diff --git a/tests/tcg/hexagon/read_write_overlap.c b/tests/tcg/hexagon/read_write_overlap.c
new file mode 100644
index 0000000000..a75fc11dc4
--- /dev/null
+++ b/tests/tcg/hexagon/read_write_overlap.c
@@ -0,0 +1,136 @@
+/*
+ * Copyright(c) 2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Test instructions where the semantics write to the destination
+ * before all the operand reads have been completed.
+ *
+ * These instructions are problematic when we short-circuit the
+ * register writes because the destination and source operands could
+ * be the same TCGv.
+ *
+ * We test by forcing the read and write to be register r7.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+
+int err;
+
+static void __check(const char *filename, int line, int x, int expect)
+{
+ if (x != expect) {
+ printf("ERROR %s:%d - 0x%08x != 0x%08x\n",
+ filename, line, x, expect);
+ err++;
+ }
+}
+
+#define check(x, expect) __check(__FILE__, __LINE__, (x), (expect))
+
+#define insert(RES, X, WIDTH, OFFSET) \
+ asm("r7 = %1\n\t" \
+ "r7 = insert(r7, #" #WIDTH ", #" #OFFSET ")\n\t" \
+ "%0 = r7\n\t" \
+ : "=r"(RES) : "r"(X) : "r7")
+
+static void test_insert(void)
+{
+ uint32_t res;
+
+ insert(res, 0x12345678, 8, 1);
+ check(res, 0x123456f0);
+ insert(res, 0x12345678, 0, 1);
+ check(res, 0x12345678);
+ insert(res, 0x12345678, 20, 16);
+ check(res, 0x56785678);
+}
+
+static inline uint32_t insert_rp(uint32_t x, uint32_t width, uint32_t offset)
+{
+ uint64_t width_offset = (uint64_t)width << 32 | offset;
+ uint32_t res;
+ asm("r7 = %1\n\t"
+ "r7 = insert(r7, %2)\n\t"
+ "%0 = r7\n\t"
+ : "=r"(res) : "r"(x), "r"(width_offset) : "r7");
+ return res;
+
+}
+
+static void test_insert_rp(void)
+{
+ check(insert_rp(0x12345678, 8, 1), 0x123456f0);
+ check(insert_rp(0x12345678, 63, 8), 0x34567878);
+ check(insert_rp(0x12345678, 127, 8), 0x34567878);
+ check(insert_rp(0x12345678, 8, 24), 0x78345678);
+ check(insert_rp(0x12345678, 8, 63), 0x12345678);
+ check(insert_rp(0x12345678, 8, 64), 0x00000000);
+}
+
+static inline uint32_t asr_r_svw_trun(uint64_t x, uint32_t y)
+{
+ uint32_t res;
+ asm("r7 = %2\n\t"
+ "r7 = vasrw(%1, r7)\n\t"
+ "%0 = r7\n\t"
+ : "=r"(res) : "r"(x), "r"(y) : "r7");
+ return res;
+}
+
+static void test_asr_r_svw_trun(void)
+{
+ check(asr_r_svw_trun(0x1111111122222222ULL, 5),
+ 0x88881111);
+ check(asr_r_svw_trun(0x1111111122222222ULL, 63),
+ 0x00000000);
+ check(asr_r_svw_trun(0x1111111122222222ULL, 64),
+ 0x00000000);
+ check(asr_r_svw_trun(0x1111111122222222ULL, 127),
+ 0x22224444);
+ check(asr_r_svw_trun(0x1111111122222222ULL, 128),
+ 0x11112222);
+ check(asr_r_svw_trun(0xffffffff22222222ULL, 128),
+ 0xffff2222);
+}
+
+static inline uint32_t swiz(uint32_t x)
+{
+ uint32_t res;
+ asm("r7 = %1\n\t"
+ "r7 = swiz(r7)\n\t"
+ "%0 = r7\n\t"
+ : "=r"(res) : "r"(x) : "r7");
+ return res;
+}
+
+static void test_swiz(void)
+{
+ check(swiz(0x11223344), 0x44332211);
+}
+
+int main()
+{
+ test_insert();
+ test_insert_rp();
+ test_asr_r_svw_trun();
+ test_swiz();
+
+ puts(err ? "FAIL" : "PASS");
+ return err ? EXIT_FAILURE : EXIT_SUCCESS;
+}
diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target
index 7c94db4bc4..d8d3793732 100644
--- a/tests/tcg/hexagon/Makefile.target
+++ b/tests/tcg/hexagon/Makefile.target
@@ -45,6 +45,7 @@ HEX_TESTS += fpstuff
HEX_TESTS += overflow
HEX_TESTS += signal_context
HEX_TESTS += reg_mut
+HEX_TESTS += read_write_overlap
HEX_TESTS += vector_add_int
HEX_TESTS += scatter_gather
HEX_TESTS += hvx_misc
--
2.25.1
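For reviewers who want to cross-check the expected values in
test_asr_r_svw_trun against the semantics quoted in the
gen_asr_r_svw_trun comment, here is a minimal host-side sketch in plain
C. It is not part of the patch. The names sxt7, asr64 and vasrw_model
are hypothetical, and the "shift left by 64 gives zero" case simply
mirrors the zero label in the TCG override rather than any hardware
documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 7 bits of rt, mirroring fSXTN(7, 32, RtV). */
static int32_t sxt7(uint32_t rt)
{
    int32_t v = (int32_t)(rt & 0x7f);
    return (v & 0x40) ? v - 128 : v;
}

/* Portable arithmetic shift right for s in 1 .. 63. */
static int64_t asr64(int64_t v, unsigned s)
{
    return v < 0 ? ~(~v >> s) : v >> s;
}

/* Hypothetical reference model of Rd = vasrw(Rss, Rt). */
static uint32_t vasrw_model(uint64_t rss, uint32_t rt)
{
    int32_t shamt = sxt7(rt);
    uint32_t rd = 0;

    assert(shamt >= -64 && shamt <= 63);
    for (int i = 0; i < 2; i++) {
        /* Sign-extend word i of Rss to 64 bits (fCAST4_8s(fGETWORD)). */
        int64_t word = (int32_t)(uint32_t)(rss >> (i * 32));
        uint64_t shifted;

        if (shamt > 0) {
            shifted = (uint64_t)asr64(word, (unsigned)shamt);
        } else if (shamt == -64) {
            /* Mirrors the "zero" label in the TCG override. */
            shifted = 0;
        } else {
            /* Left shift by 0 .. 63, done on the unsigned value. */
            shifted = (uint64_t)word << -shamt;
        }
        /* fSETHALF(i, RdV, fGETHALF(0, ...)): keep the low halfword. */
        rd |= (uint32_t)(shifted & 0xffff) << (i * 16);
    }
    return rd;
}
```

Feeding the model the six input pairs from test_asr_r_svw_trun should
reproduce the constants used in the check() calls.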
* [PATCH v2 16/21] Hexagon (target/hexagon) Make special new_value for USR
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (14 preceding siblings ...)
2023-04-27 23:00 ` [PATCH v2 15/21] Hexagon (target/hexagon) Add overrides for disabled idef-parser insns Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 17/21] Hexagon (target/hexagon) Move new_value to DisasContext Taylor Simpson
` (4 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
Precursor to moving new_value from the global state to DisasContext.
USR will need to stay in the global state because some helpers will
set its value.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
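Note for reviewers (not part of the commit): the selection logic that
get_result_gpr ends up with after this patch can be modelled outside of
TCG roughly as below. This is only a sketch under stated assumptions.
ModelState, result_gpr, NUM_REGS and REG_USR are hypothetical stand-ins
with made-up sizes, and plain ints take the place of TCGv globals.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes and indices, chosen only for illustration. */
enum { NUM_REGS = 8, REG_USR = 3 };

typedef struct {
    int gpr[NUM_REGS];         /* architectural registers */
    int new_value[NUM_REGS];   /* per-packet staging, slot REG_USR unused */
    int new_value_usr;         /* kept separately: helpers write USR */
} ModelState;

/* Pick the location that a register write in the packet should target. */
static int *result_gpr(ModelState *s, bool need_commit, int rnum)
{
    assert(rnum >= 0 && rnum < NUM_REGS);
    if (!need_commit) {
        /* Short-circuit: write straight to the architectural register. */
        return &s->gpr[rnum];
    }
    if (rnum == REG_USR) {
        /* USR staging has its own dedicated location. */
        return &s->new_value_usr;
    }
    return &s->new_value[rnum];
}
```

The point of the split is that every reader and writer of the staged
USR value goes through one accessor, so the remaining new_value entries
can move elsewhere later without touching the USR path.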
target/hexagon/cpu.h | 1 +
target/hexagon/genptr.h | 1 +
target/hexagon/macros.h | 2 +-
target/hexagon/translate.h | 1 +
target/hexagon/genptr.c | 8 ++++++--
target/hexagon/translate.c | 22 +++++++++++++++-------
target/hexagon/README | 2 +-
target/hexagon/gen_tcg_funcs.py | 2 +-
8 files changed, 27 insertions(+), 12 deletions(-)
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 9252055a38..3687f2caa2 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -86,6 +86,7 @@ typedef struct CPUArchState {
uint8_t slot_cancelled;
target_ulong new_value[TOTAL_PER_THREAD_REGS];
+ target_ulong new_value_usr;
/*
* Only used when HEX_DEBUG is on, but unconditionally included
diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
index e11ccc2358..a4b43c2910 100644
--- a/target/hexagon/genptr.h
+++ b/target/hexagon/genptr.h
@@ -35,6 +35,7 @@ void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src, uint32_t slot);
void gen_store8i(TCGv_env cpu_env, TCGv vaddr, int64_t src, uint32_t slot);
TCGv gen_read_reg(TCGv result, int num);
TCGv gen_read_preg(TCGv pred, uint8_t num);
+TCGv get_result_gpr(DisasContext *ctx, int rnum);
TCGv get_result_pred(DisasContext *ctx, int pnum);
void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val);
void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val);
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index a68446a367..27172193a0 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -46,7 +46,7 @@
#define SET_USR_FIELD(FIELD, VAL) \
do { \
if (pkt_need_commit) { \
- fINSERT_BITS(env->new_value[HEX_REG_USR], \
+ fINSERT_BITS(env->new_value_usr, \
reg_field_info[FIELD].width, \
reg_field_info[FIELD].offset, (VAL)); \
} else { \
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 26bcae0395..4c17433a6f 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -191,6 +191,7 @@ extern TCGv hex_this_PC;
extern TCGv hex_slot_cancelled;
extern TCGv hex_branch_taken;
extern TCGv hex_new_value[TOTAL_PER_THREAD_REGS];
+extern TCGv hex_new_value_usr;
extern TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
extern TCGv hex_new_pred_value[NUM_PREGS];
extern TCGv hex_pred_written;
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 0727d4524b..ede1474ea5 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -68,10 +68,14 @@ static inline void gen_masked_reg_write(TCGv new_val, TCGv cur_val,
}
}
-static TCGv get_result_gpr(DisasContext *ctx, int rnum)
+TCGv get_result_gpr(DisasContext *ctx, int rnum)
{
if (ctx->need_commit) {
- return hex_new_value[rnum];
+ if (rnum == HEX_REG_USR) {
+ return hex_new_value_usr;
+ } else {
+ return hex_new_value[rnum];
+ }
} else {
return hex_gpr[rnum];
}
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index c7a04e34d2..d46a724c1b 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -45,6 +45,7 @@ TCGv hex_this_PC;
TCGv hex_slot_cancelled;
TCGv hex_branch_taken;
TCGv hex_new_value[TOTAL_PER_THREAD_REGS];
+TCGv hex_new_value_usr;
TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
TCGv hex_new_pred_value[NUM_PREGS];
TCGv hex_pred_written;
@@ -547,12 +548,12 @@ static void gen_start_packet(DisasContext *ctx)
tcg_gen_movi_tl(hex_pred_written, 0);
}
- /* Preload the predicated registers into hex_new_value[i] */
+ /* Preload the predicated registers into get_result_gpr(ctx, i) */
if (ctx->need_commit &&
!bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
int i = find_first_bit(ctx->predicated_regs, TOTAL_PER_THREAD_REGS);
while (i < TOTAL_PER_THREAD_REGS) {
- tcg_gen_mov_tl(hex_new_value[i], hex_gpr[i]);
+ tcg_gen_mov_tl(get_result_gpr(ctx, i), hex_gpr[i]);
i = find_next_bit(ctx->predicated_regs, TOTAL_PER_THREAD_REGS,
i + 1);
}
@@ -664,7 +665,7 @@ static void gen_reg_writes(DisasContext *ctx)
for (i = 0; i < ctx->reg_log_idx; i++) {
int reg_num = ctx->reg_log[i];
- tcg_gen_mov_tl(hex_gpr[reg_num], hex_new_value[reg_num]);
+ tcg_gen_mov_tl(hex_gpr[reg_num], get_result_gpr(ctx, reg_num));
/*
* ctx->is_tight_loop is set when SA0 points to the beginning of the TB.
@@ -1177,10 +1178,14 @@ void hexagon_translate_init(void)
offsetof(CPUHexagonState, gpr[i]),
hexagon_regnames[i]);
- snprintf(new_value_names[i], NAME_LEN, "new_%s", hexagon_regnames[i]);
- hex_new_value[i] = tcg_global_mem_new(cpu_env,
- offsetof(CPUHexagonState, new_value[i]),
- new_value_names[i]);
+ if (i == HEX_REG_USR) {
+ hex_new_value[i] = NULL;
+ } else {
+ snprintf(new_value_names[i], NAME_LEN, "new_%s", hexagon_regnames[i]);
+ hex_new_value[i] = tcg_global_mem_new(cpu_env,
+ offsetof(CPUHexagonState, new_value[i]),
+ new_value_names[i]);
+ }
if (HEX_DEBUG) {
snprintf(reg_written_names[i], NAME_LEN, "reg_written_%s",
@@ -1190,6 +1195,9 @@ void hexagon_translate_init(void)
reg_written_names[i]);
}
}
+ hex_new_value_usr = tcg_global_mem_new(cpu_env,
+ offsetof(CPUHexagonState, new_value_usr), "new_value_usr");
+
for (i = 0; i < NUM_PREGS; i++) {
hex_pred[i] = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, pred[i]),
diff --git a/target/hexagon/README b/target/hexagon/README
index fe90df63e8..a9a517cfc8 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -186,7 +186,7 @@ We also generate an analyze_<tag> function for each instruction. Currently,
these functions record the writes to registers by calling ctx_log_*. During
gen_start_packet, we invoke the analyze_<tag> function for each instruction in
the packet, and we mark the implicit writes. After the analysis is performed,
-we initialize hex_new_value for each of the predicated assignments.
+we initialize the result register for each of the predicated assignments.
In addition to instruction semantics, we use a generator to create the decode
tree. This generation is also a two step process. The first step is to run
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index 0e45d43685..a36117d57f 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -190,7 +190,7 @@ def genptr_decl_new(f, tag, regtype, regid, regno):
if regid in {"s", "t"}:
f.write(
f" TCGv {regtype}{regid}N = "
- f"hex_new_value[insn->regno[{regno}]];\n"
+ f"get_result_gpr(ctx, insn->regno[{regno}]);\n"
)
else:
print("Bad register parse: ", regtype, regid)
--
2.25.1
* [PATCH v2 17/21] Hexagon (target/hexagon) Move new_value to DisasContext
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (15 preceding siblings ...)
2023-04-27 23:00 ` [PATCH v2 16/21] Hexagon (target/hexagon) Make special new_value for USR Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 18/21] Hexagon (target/hexagon) Move new_pred_value " Taylor Simpson
` (3 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
The new_value array in the CPUHexagonState is only used for bookkeeping
within the translation of a packet. With recent changes that eliminate
the need to free TCGv variables, it makes more sense for this array to
be transient and kept in DisasContext.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
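Note for reviewers (not part of the commit): the allocate-on-first-use
pattern this patch introduces in get_result_gpr, together with the
per-packet reset in gen_start_packet, can be sketched in plain C as
follows. ModelCtx, start_packet, result_reg and the fixed-size temps
pool are hypothetical. In the real code the temporaries come from
tcg_temp_new() and are zeroed with tcg_gen_movi_tl().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
enum { NUM_REGS = 8, MAX_TEMPS = 16 };

typedef struct {
    int temps[MAX_TEMPS];      /* stand-in for the TCG temp allocator */
    int n_temps;
    int *new_value[NUM_REGS];  /* NULL until the register is first used */
} ModelCtx;

/* Forget all staging temporaries at the start of each packet. */
static void start_packet(ModelCtx *ctx)
{
    for (int i = 0; i < NUM_REGS; i++) {
        ctx->new_value[i] = NULL;
    }
}

/* Return the staging temporary for rnum, creating it on first use. */
static int *result_reg(ModelCtx *ctx, int rnum)
{
    assert(rnum >= 0 && rnum < NUM_REGS);
    if (ctx->new_value[rnum] == NULL) {
        assert(ctx->n_temps < MAX_TEMPS);
        ctx->new_value[rnum] = &ctx->temps[ctx->n_temps++];
        *ctx->new_value[rnum] = 0;   /* mirrors the movi of 0 */
    }
    return ctx->new_value[rnum];
}
```

Within one packet, repeated lookups of the same register return the
same temporary. After the next start_packet call, a fresh one is handed
out.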
target/hexagon/cpu.h | 1 -
target/hexagon/translate.h | 2 +-
target/hexagon/genptr.c | 6 +++++-
target/hexagon/translate.c | 14 +++-----------
4 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 3687f2caa2..22aba20be2 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -85,7 +85,6 @@ typedef struct CPUArchState {
target_ulong stack_start;
uint8_t slot_cancelled;
- target_ulong new_value[TOTAL_PER_THREAD_REGS];
target_ulong new_value_usr;
/*
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 4c17433a6f..6dde487566 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -69,6 +69,7 @@ typedef struct DisasContext {
bool need_pkt_has_store_s1;
bool short_circuit;
bool has_hvx_helper;
+ TCGv new_value[TOTAL_PER_THREAD_REGS];
} DisasContext;
static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
@@ -190,7 +191,6 @@ extern TCGv hex_pred[NUM_PREGS];
extern TCGv hex_this_PC;
extern TCGv hex_slot_cancelled;
extern TCGv hex_branch_taken;
-extern TCGv hex_new_value[TOTAL_PER_THREAD_REGS];
extern TCGv hex_new_value_usr;
extern TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
extern TCGv hex_new_pred_value[NUM_PREGS];
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index ede1474ea5..c7a8e2ce55 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -74,7 +74,11 @@ TCGv get_result_gpr(DisasContext *ctx, int rnum)
if (rnum == HEX_REG_USR) {
return hex_new_value_usr;
} else {
- return hex_new_value[rnum];
+ if (ctx->new_value[rnum] == NULL) {
+ ctx->new_value[rnum] = tcg_temp_new();
+ tcg_gen_movi_tl(ctx->new_value[rnum], 0);
+ }
+ return ctx->new_value[rnum];
}
} else {
return hex_gpr[rnum];
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index d46a724c1b..5f35bb20e7 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -44,7 +44,6 @@ TCGv hex_pred[NUM_PREGS];
TCGv hex_this_PC;
TCGv hex_slot_cancelled;
TCGv hex_branch_taken;
-TCGv hex_new_value[TOTAL_PER_THREAD_REGS];
TCGv hex_new_value_usr;
TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
TCGv hex_new_pred_value[NUM_PREGS];
@@ -513,6 +512,9 @@ static void gen_start_packet(DisasContext *ctx)
}
ctx->s1_store_processed = false;
ctx->pre_commit = true;
+ for (i = 0; i < TOTAL_PER_THREAD_REGS; i++) {
+ ctx->new_value[i] = NULL;
+ }
analyze_packet(ctx);
@@ -1156,7 +1158,6 @@ void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int *max_insns,
}
#define NAME_LEN 64
-static char new_value_names[TOTAL_PER_THREAD_REGS][NAME_LEN];
static char reg_written_names[TOTAL_PER_THREAD_REGS][NAME_LEN];
static char new_pred_value_names[NUM_PREGS][NAME_LEN];
static char store_addr_names[STORES_MAX][NAME_LEN];
@@ -1178,15 +1179,6 @@ void hexagon_translate_init(void)
offsetof(CPUHexagonState, gpr[i]),
hexagon_regnames[i]);
- if (i == HEX_REG_USR) {
- hex_new_value[i] = NULL;
- } else {
- snprintf(new_value_names[i], NAME_LEN, "new_%s", hexagon_regnames[i]);
- hex_new_value[i] = tcg_global_mem_new(cpu_env,
- offsetof(CPUHexagonState, new_value[i]),
- new_value_names[i]);
- }
-
if (HEX_DEBUG) {
snprintf(reg_written_names[i], NAME_LEN, "reg_written_%s",
hexagon_regnames[i]);
--
2.25.1
* [PATCH v2 18/21] Hexagon (target/hexagon) Move new_pred_value to DisasContext
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (16 preceding siblings ...)
2023-04-27 23:00 ` [PATCH v2 17/21] Hexagon (target/hexagon) Move new_value to DisasContext Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 19/21] Hexagon (target/hexagon) Move pred_written " Taylor Simpson
` (2 subsequent siblings)
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
The new_pred_value array in the CPUHexagonState is only used for
bookkeeping within the translation of a packet. With recent changes
that eliminate the need to free TCGv variables, it makes more sense
for this array to be transient and kept in DisasContext.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/cpu.h | 1 -
target/hexagon/gen_tcg.h | 12 ++++++------
target/hexagon/translate.h | 2 +-
target/hexagon/genptr.c | 10 +++++++---
target/hexagon/idef-parser/parser-helpers.c | 2 +-
target/hexagon/op_helper.c | 2 +-
target/hexagon/translate.c | 16 ++++++----------
target/hexagon/gen_tcg_funcs.py | 2 +-
8 files changed, 23 insertions(+), 24 deletions(-)
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 22aba20be2..8ce2ceeee4 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -94,7 +94,6 @@ typedef struct CPUArchState {
target_ulong this_PC;
target_ulong reg_written[TOTAL_PER_THREAD_REGS];
- target_ulong new_pred_value[NUM_PREGS];
target_ulong pred_written;
MemLog mem_log_stores[STORES_MAX];
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index fabc1eb623..97dfdcb326 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -581,9 +581,9 @@
#define fGEN_TCG_SL2_return_f(SHORTCODE) \
gen_cond_return_subinsn(ctx, TCG_COND_NE, hex_pred[0])
#define fGEN_TCG_SL2_return_tnew(SHORTCODE) \
- gen_cond_return_subinsn(ctx, TCG_COND_EQ, hex_new_pred_value[0])
+ gen_cond_return_subinsn(ctx, TCG_COND_EQ, ctx->new_pred_value[0])
#define fGEN_TCG_SL2_return_fnew(SHORTCODE) \
- gen_cond_return_subinsn(ctx, TCG_COND_NE, hex_new_pred_value[0])
+ gen_cond_return_subinsn(ctx, TCG_COND_NE, ctx->new_pred_value[0])
/*
* Mathematical operations with more than one definition require
@@ -1118,7 +1118,7 @@
#define fGEN_TCG_SA1_clrtnew(SHORTCODE) \
do { \
tcg_gen_movcond_tl(TCG_COND_EQ, RdV, \
- hex_new_pred_value[0], tcg_constant_tl(0), \
+ ctx->new_pred_value[0], tcg_constant_tl(0), \
RdV, tcg_constant_tl(0)); \
} while (0)
@@ -1126,7 +1126,7 @@
#define fGEN_TCG_SA1_clrfnew(SHORTCODE) \
do { \
tcg_gen_movcond_tl(TCG_COND_NE, RdV, \
- hex_new_pred_value[0], tcg_constant_tl(0), \
+ ctx->new_pred_value[0], tcg_constant_tl(0), \
RdV, tcg_constant_tl(0)); \
} while (0)
@@ -1153,9 +1153,9 @@
gen_cond_jumpr31(ctx, TCG_COND_NE, hex_pred[0])
#define fGEN_TCG_SL2_jumpr31_tnew(SHORTCODE) \
- gen_cond_jumpr31(ctx, TCG_COND_EQ, hex_new_pred_value[0])
+ gen_cond_jumpr31(ctx, TCG_COND_EQ, ctx->new_pred_value[0])
#define fGEN_TCG_SL2_jumpr31_fnew(SHORTCODE) \
- gen_cond_jumpr31(ctx, TCG_COND_NE, hex_new_pred_value[0])
+ gen_cond_jumpr31(ctx, TCG_COND_NE, ctx->new_pred_value[0])
/* Count trailing zeros/ones */
#define fGEN_TCG_S2_ct0(SHORTCODE) \
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 6dde487566..fdfa1b6fe3 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -70,6 +70,7 @@ typedef struct DisasContext {
bool short_circuit;
bool has_hvx_helper;
TCGv new_value[TOTAL_PER_THREAD_REGS];
+ TCGv new_pred_value[NUM_PREGS];
} DisasContext;
static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
@@ -193,7 +194,6 @@ extern TCGv hex_slot_cancelled;
extern TCGv hex_branch_taken;
extern TCGv hex_new_value_usr;
extern TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
-extern TCGv hex_new_pred_value[NUM_PREGS];
extern TCGv hex_pred_written;
extern TCGv hex_store_addr[STORES_MAX];
extern TCGv hex_store_width[STORES_MAX];
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index c7a8e2ce55..c71bea0530 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -121,7 +121,11 @@ static void gen_log_reg_write_pair(DisasContext *ctx, int rnum, TCGv_i64 val)
TCGv get_result_pred(DisasContext *ctx, int pnum)
{
if (ctx->need_commit) {
- return hex_new_pred_value[pnum];
+ if (ctx->new_pred_value[pnum] == NULL) {
+ ctx->new_pred_value[pnum] = tcg_temp_new();
+ tcg_gen_movi_tl(ctx->new_pred_value[pnum], 0);
+ }
+ return ctx->new_pred_value[pnum];
} else {
return hex_pred[pnum];
}
@@ -607,7 +611,7 @@ static void gen_cmpnd_cmp_jmp(DisasContext *ctx,
gen_log_pred_write(ctx, pnum, pred);
} else {
TCGv pred = tcg_temp_new();
- tcg_gen_mov_tl(pred, hex_new_pred_value[pnum]);
+ tcg_gen_mov_tl(pred, ctx->new_pred_value[pnum]);
gen_cond_jump(ctx, cond2, pred, pc_off);
}
}
@@ -664,7 +668,7 @@ static void gen_cmpnd_tstbit0_jmp(DisasContext *ctx,
gen_log_pred_write(ctx, pnum, pred);
} else {
TCGv pred = tcg_temp_new();
- tcg_gen_mov_tl(pred, hex_new_pred_value[pnum]);
+ tcg_gen_mov_tl(pred, ctx->new_pred_value[pnum]);
gen_cond_jump(ctx, cond, pred, pc_off);
}
}
diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c
index ae0f60ada4..75c3b3efed 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -1856,7 +1856,7 @@ HexValue gen_rvalue_pred(Context *c, YYLTYPE *locp, HexValue *pred)
*pred = gen_tmp(c, locp, 32, UNSIGNED);
if (is_dotnew) {
OUT(c, locp, "tcg_gen_mov_i32(", pred,
- ", hex_new_pred_value[");
+ ", ctx->new_pred_value[");
OUT(c, locp, pred_str, "]);\n");
} else {
OUT(c, locp, "gen_read_preg(", pred, ", ", pred_str, ");\n");
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index fc5c30a141..26fba9f5d6 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -231,7 +231,7 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
pred_printed = true;
}
HEX_DEBUG_LOG("\tp%d = 0x" TARGET_FMT_lx "\n",
- i, env->new_pred_value[i]);
+ i, env->pred[i]);
}
}
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 5f35bb20e7..890badac10 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -46,7 +46,6 @@ TCGv hex_slot_cancelled;
TCGv hex_branch_taken;
TCGv hex_new_value_usr;
TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
-TCGv hex_new_pred_value[NUM_PREGS];
TCGv hex_pred_written;
TCGv hex_store_addr[STORES_MAX];
TCGv hex_store_width[STORES_MAX];
@@ -515,6 +514,9 @@ static void gen_start_packet(DisasContext *ctx)
for (i = 0; i < TOTAL_PER_THREAD_REGS; i++) {
ctx->new_value[i] = NULL;
}
+ for (i = 0; i < NUM_PREGS; i++) {
+ ctx->new_pred_value[i] = NULL;
+ }
analyze_packet(ctx);
@@ -568,7 +570,8 @@ static void gen_start_packet(DisasContext *ctx)
if (ctx->need_commit && pkt->pkt_has_endloop) {
for (int i = 0; i < ctx->preg_log_idx; i++) {
int pred_num = ctx->preg_log[i];
- tcg_gen_mov_tl(hex_new_pred_value[pred_num], hex_pred[pred_num]);
+ ctx->new_pred_value[pred_num] = tcg_temp_new();
+ tcg_gen_mov_tl(ctx->new_pred_value[pred_num], hex_pred[pred_num]);
}
}
@@ -688,7 +691,7 @@ static void gen_pred_writes(DisasContext *ctx)
for (int i = 0; i < ctx->preg_log_idx; i++) {
int pred_num = ctx->preg_log[i];
- tcg_gen_mov_tl(hex_pred[pred_num], hex_new_pred_value[pred_num]);
+ tcg_gen_mov_tl(hex_pred[pred_num], ctx->new_pred_value[pred_num]);
}
}
@@ -1159,7 +1162,6 @@ void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int *max_insns,
#define NAME_LEN 64
static char reg_written_names[TOTAL_PER_THREAD_REGS][NAME_LEN];
-static char new_pred_value_names[NUM_PREGS][NAME_LEN];
static char store_addr_names[STORES_MAX][NAME_LEN];
static char store_width_names[STORES_MAX][NAME_LEN];
static char store_val32_names[STORES_MAX][NAME_LEN];
@@ -1194,12 +1196,6 @@ void hexagon_translate_init(void)
hex_pred[i] = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, pred[i]),
hexagon_prednames[i]);
-
- snprintf(new_pred_value_names[i], NAME_LEN, "new_pred_%s",
- hexagon_prednames[i]);
- hex_new_pred_value[i] = tcg_global_mem_new(cpu_env,
- offsetof(CPUHexagonState, new_pred_value[i]),
- new_pred_value_names[i]);
}
hex_pred_written = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, pred_written), "pred_written");
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index a36117d57f..0403547387 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -198,7 +198,7 @@ def genptr_decl_new(f, tag, regtype, regid, regno):
if regid in {"t", "u", "v"}:
f.write(
f" TCGv {regtype}{regid}N = "
- f"hex_new_pred_value[insn->regno[{regno}]];\n"
+ f"ctx->new_pred_value[insn->regno[{regno}]];\n"
)
else:
print("Bad register parse: ", regtype, regid)
--
2.25.1
* [PATCH v2 19/21] Hexagon (target/hexagon) Move pred_written to DisasContext
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (17 preceding siblings ...)
2023-04-27 23:00 ` [PATCH v2 18/21] Hexagon (target/hexagon) Move new_pred_value " Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 20/21] Hexagon (target/hexagon) Move pkt_has_store_s1 " Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 21/21] Hexagon (target/hexagon) Move items " Taylor Simpson
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
The pred_written variable in the CPUHexagonState is only used for
bookkeeping within the translation of a packet. With recent changes
that eliminate the need to free TCGv variables, it makes more sense
for this variable to be transient and kept in DisasContext.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
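Note for reviewers (not part of the commit): with this change the debug
helper receives the written-predicate mask as an argument instead of
reading it from env. A rough plain-C model of that data flow is shown
below. log_pred_write and count_preds_to_report are hypothetical names,
and NUM_PREGS = 4 is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed number of predicate registers for this sketch. */
enum { NUM_PREGS = 4 };

/* Analogue of the tcg_gen_ori_tl() in gen_log_pred_write. */
static uint32_t log_pred_write(uint32_t pred_written, int pnum)
{
    assert(pnum >= 0 && pnum < NUM_PREGS);
    return pred_written | (1u << pnum);
}

/*
 * Stand-in for the debug helper: the mask arrives as a parameter, so
 * no CPU-state field is needed. Returns how many predicates would be
 * reported.
 */
static int count_preds_to_report(uint32_t pred_written)
{
    int count = 0;

    for (int i = 0; i < NUM_PREGS; i++) {
        if (pred_written & (1u << i)) {
            count++;
        }
    }
    return count;
}
```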
target/hexagon/cpu.h | 2 --
target/hexagon/helper.h | 2 +-
target/hexagon/translate.h | 2 +-
target/hexagon/genptr.c | 2 +-
target/hexagon/op_helper.c | 5 +++--
target/hexagon/translate.c | 9 ++++-----
6 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 8ce2ceeee4..26952cddcb 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -94,8 +94,6 @@ typedef struct CPUArchState {
target_ulong this_PC;
target_ulong reg_written[TOTAL_PER_THREAD_REGS];
- target_ulong pred_written;
-
MemLog mem_log_stores[STORES_MAX];
target_ulong pkt_has_store_s1;
target_ulong dczero_addr;
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index 4b750d0351..f3b298beee 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -21,7 +21,7 @@
DEF_HELPER_FLAGS_2(raise_exception, TCG_CALL_NO_RETURN, noreturn, env, i32)
DEF_HELPER_1(debug_start_packet, void, env)
DEF_HELPER_FLAGS_3(debug_check_store_width, TCG_CALL_NO_WG, void, env, int, int)
-DEF_HELPER_FLAGS_3(debug_commit_end, TCG_CALL_NO_WG, void, env, int, int)
+DEF_HELPER_FLAGS_4(debug_commit_end, TCG_CALL_NO_WG, void, env, int, int, int)
DEF_HELPER_2(commit_store, void, env, int)
DEF_HELPER_3(gather_store, void, env, i32, int)
DEF_HELPER_1(commit_hvx_stores, void, env)
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index fdfa1b6fe3..a9f1ccee24 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -71,6 +71,7 @@ typedef struct DisasContext {
bool has_hvx_helper;
TCGv new_value[TOTAL_PER_THREAD_REGS];
TCGv new_pred_value[NUM_PREGS];
+ TCGv pred_written;
} DisasContext;
static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
@@ -194,7 +195,6 @@ extern TCGv hex_slot_cancelled;
extern TCGv hex_branch_taken;
extern TCGv hex_new_value_usr;
extern TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
-extern TCGv hex_pred_written;
extern TCGv hex_store_addr[STORES_MAX];
extern TCGv hex_store_width[STORES_MAX];
extern TCGv hex_store_val32[STORES_MAX];
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index c71bea0530..1ad4d636f8 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -151,7 +151,7 @@ void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
tcg_gen_and_tl(pred, pred, base_val);
}
if (HEX_DEBUG) {
- tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pnum);
+ tcg_gen_ori_tl(ctx->pred_written, ctx->pred_written, 1 << pnum);
}
set_bit(pnum, ctx->pregs_written);
}
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 26fba9f5d6..f9021efc7e 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -203,7 +203,8 @@ static void print_store(CPUHexagonState *env, int slot)
}
/* This function is a handy place to set a breakpoint */
-void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
+void HELPER(debug_commit_end)(CPUHexagonState *env,
+ int pred_written, int has_st0, int has_st1)
{
bool reg_printed = false;
bool pred_printed = false;
@@ -225,7 +226,7 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
}
for (i = 0; i < NUM_PREGS; i++) {
- if (env->pred_written & (1 << i)) {
+ if (pred_written & (1 << i)) {
if (!pred_printed) {
HEX_DEBUG_LOG("Predicates written\n");
pred_printed = true;
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 890badac10..b185dda35a 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -46,7 +46,6 @@ TCGv hex_slot_cancelled;
TCGv hex_branch_taken;
TCGv hex_new_value_usr;
TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
-TCGv hex_pred_written;
TCGv hex_store_addr[STORES_MAX];
TCGv hex_store_width[STORES_MAX];
TCGv hex_store_val32[STORES_MAX];
@@ -549,7 +548,8 @@ static void gen_start_packet(DisasContext *ctx)
}
}
if (HEX_DEBUG) {
- tcg_gen_movi_tl(hex_pred_written, 0);
+ ctx->pred_written = tcg_temp_new();
+ tcg_gen_movi_tl(ctx->pred_written, 0);
}
/* Preload the predicated registers into get_result_gpr(ctx, i) */
@@ -1004,7 +1004,8 @@ static void gen_commit_packet(DisasContext *ctx)
tcg_constant_tl(pkt->pkt_has_store_s1 && !pkt->pkt_has_dczeroa);
/* Handy place to set a breakpoint at the end of execution */
- gen_helper_debug_commit_end(cpu_env, has_st0, has_st1);
+ gen_helper_debug_commit_end(cpu_env, ctx->pred_written,
+ has_st0, has_st1);
}
if (pkt->vhist_insn != NULL) {
@@ -1197,8 +1198,6 @@ void hexagon_translate_init(void)
offsetof(CPUHexagonState, pred[i]),
hexagon_prednames[i]);
}
- hex_pred_written = tcg_global_mem_new(cpu_env,
- offsetof(CPUHexagonState, pred_written), "pred_written");
hex_this_PC = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, this_PC), "this_PC");
hex_slot_cancelled = tcg_global_mem_new(cpu_env,
--
2.25.1
* [PATCH v2 20/21] Hexagon (target/hexagon) Move pkt_has_store_s1 to DisasContext
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (18 preceding siblings ...)
2023-04-27 23:00 ` [PATCH v2 19/21] Hexagon (target/hexagon) Move pred_written " Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 21/21] Hexagon (target/hexagon) Move items " Taylor Simpson
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
The pkt_has_store_s1 field is only used for bookkeeping in helpers that
perform a load. With recent changes that eliminate the need to free TCGv
variables, it makes more sense to make this transient.
These helpers already take the instruction slot as an argument. We
combine the slot and pkt_has_store_s1 into a single argument called
slotval.
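The packing can be sketched in plain C as follows. This is only an
illustrative model, not code from the patch: pack_slotval(),
slotval_has_store_s1() and slotval_slot() are hypothetical names. The bit
layout (bit 0 for pkt_has_store_s1, the remaining bits for the slot) follows
gen_slotval() and the prologue written by gen_helper_funcs.py in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the slotval encoding:
 *   bit 0     -> pkt_has_store_s1
 *   bits 1..  -> instruction slot
 */
static inline uint32_t pack_slotval(bool pkt_has_store_s1, uint32_t slot)
{
    /* The slot must fit in the 31 bits left after the flag bit. */
    assert(slot <= (UINT32_MAX >> 1));
    return (pkt_has_store_s1 ? 1u : 0u) | (slot << 1);
}

/* Recover the flag, as the generated helper prologue does with "& 0x1". */
static inline bool slotval_has_store_s1(uint32_t slotval)
{
    return (slotval & 0x1) != 0;
}

/* Recover the slot, as the generated helper prologue does with ">> 1". */
static inline uint32_t slotval_slot(uint32_t slotval)
{
    return slotval >> 1;
}
```

Because both inputs are known at translation time, the packed value can be
emitted as a single constant argument, so the helper signature does not grow.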
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/cpu.h | 1 -
target/hexagon/macros.h | 16 ++++++++--------
target/hexagon/op_helper.h | 12 ++++++++----
target/hexagon/translate.h | 1 -
target/hexagon/genptr.c | 8 ++++++++
target/hexagon/op_helper.c | 26 +++++++++++++++-----------
target/hexagon/translate.c | 7 -------
target/hexagon/gen_analyze_funcs.py | 2 --
target/hexagon/gen_helper_funcs.py | 7 ++++++-
target/hexagon/gen_tcg_funcs.py | 4 ++--
target/hexagon/hex_common.py | 7 ++++---
11 files changed, 51 insertions(+), 40 deletions(-)
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 26952cddcb..72b7d79279 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -95,7 +95,6 @@ typedef struct CPUArchState {
target_ulong reg_written[TOTAL_PER_THREAD_REGS];
MemLog mem_log_stores[STORES_MAX];
- target_ulong pkt_has_store_s1;
target_ulong dczero_addr;
float_status fp_status;
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 27172193a0..f5ebaf7f54 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -173,14 +173,14 @@
#define MEM_STORE8(VA, DATA, SLOT) \
MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
#else
-#define MEM_LOAD1s(VA) ((int8_t)mem_load1(env, slot, VA))
-#define MEM_LOAD1u(VA) ((uint8_t)mem_load1(env, slot, VA))
-#define MEM_LOAD2s(VA) ((int16_t)mem_load2(env, slot, VA))
-#define MEM_LOAD2u(VA) ((uint16_t)mem_load2(env, slot, VA))
-#define MEM_LOAD4s(VA) ((int32_t)mem_load4(env, slot, VA))
-#define MEM_LOAD4u(VA) ((uint32_t)mem_load4(env, slot, VA))
-#define MEM_LOAD8s(VA) ((int64_t)mem_load8(env, slot, VA))
-#define MEM_LOAD8u(VA) ((uint64_t)mem_load8(env, slot, VA))
+#define MEM_LOAD1s(VA) ((int8_t)mem_load1(env, pkt_has_store_s1, slot, VA))
+#define MEM_LOAD1u(VA) ((uint8_t)mem_load1(env, pkt_has_store_s1, slot, VA))
+#define MEM_LOAD2s(VA) ((int16_t)mem_load2(env, pkt_has_store_s1, slot, VA))
+#define MEM_LOAD2u(VA) ((uint16_t)mem_load2(env, pkt_has_store_s1, slot, VA))
+#define MEM_LOAD4s(VA) ((int32_t)mem_load4(env, pkt_has_store_s1, slot, VA))
+#define MEM_LOAD4u(VA) ((uint32_t)mem_load4(env, pkt_has_store_s1, slot, VA))
+#define MEM_LOAD8s(VA) ((int64_t)mem_load8(env, pkt_has_store_s1, slot, VA))
+#define MEM_LOAD8u(VA) ((uint64_t)mem_load8(env, pkt_has_store_s1, slot, VA))
#define MEM_STORE1(VA, DATA, SLOT) log_store32(env, VA, DATA, 1, SLOT)
#define MEM_STORE2(VA, DATA, SLOT) log_store32(env, VA, DATA, 2, SLOT)
diff --git a/target/hexagon/op_helper.h b/target/hexagon/op_helper.h
index 6bd4b07849..8f3764d15e 100644
--- a/target/hexagon/op_helper.h
+++ b/target/hexagon/op_helper.h
@@ -19,10 +19,14 @@
#define HEXAGON_OP_HELPER_H
/* Misc functions */
-uint8_t mem_load1(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
-uint16_t mem_load2(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
-uint32_t mem_load4(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
-uint64_t mem_load8(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
+uint8_t mem_load1(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr);
+uint16_t mem_load2(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr);
+uint32_t mem_load4(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr);
+uint64_t mem_load8(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr);
void log_store64(CPUHexagonState *env, target_ulong addr,
int64_t val, int width, int slot);
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index a9f1ccee24..9697b4de0e 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -66,7 +66,6 @@ typedef struct DisasContext {
TCGCond branch_cond;
target_ulong branch_dest;
bool is_tight_loop;
- bool need_pkt_has_store_s1;
bool short_circuit;
bool has_hvx_helper;
TCGv new_value[TOTAL_PER_THREAD_REGS];
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 1ad4d636f8..1e98e2913c 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -398,6 +398,14 @@ static inline void gen_store_conditional8(DisasContext *ctx,
tcg_gen_movi_tl(hex_llsc_addr, ~0);
}
+#ifndef CONFIG_HEXAGON_IDEF_PARSER
+static TCGv gen_slotval(DisasContext *ctx)
+{
+ int slotval = (ctx->pkt->pkt_has_store_s1 & 1) | (ctx->insn->slot << 1);
+ return tcg_constant_tl(slotval);
+}
+#endif
+
void gen_store32(TCGv vaddr, TCGv src, int width, uint32_t slot)
{
tcg_gen_mov_tl(hex_store_addr[slot], vaddr);
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index f9021efc7e..dfabce3123 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -567,41 +567,45 @@ void HELPER(probe_pkt_scalar_hvx_stores)(CPUHexagonState *env, int mask)
* If the load is in slot 0 and there is a store in slot1 (that
* wasn't cancelled), we have to do the store first.
*/
-static void check_noshuf(CPUHexagonState *env, uint32_t slot,
- target_ulong vaddr, int size)
+static void check_noshuf(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr, int size)
{
- if (slot == 0 && env->pkt_has_store_s1 &&
+ if (slot == 0 && pkt_has_store_s1 &&
((env->slot_cancelled & (1 << 1)) == 0)) {
HELPER(probe_noshuf_load)(env, vaddr, size, MMU_USER_IDX);
HELPER(commit_store)(env, 1);
}
}
-uint8_t mem_load1(CPUHexagonState *env, uint32_t slot, target_ulong vaddr)
+uint8_t mem_load1(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr)
{
uintptr_t ra = GETPC();
- check_noshuf(env, slot, vaddr, 1);
+ check_noshuf(env, pkt_has_store_s1, slot, vaddr, 1);
return cpu_ldub_data_ra(env, vaddr, ra);
}
-uint16_t mem_load2(CPUHexagonState *env, uint32_t slot, target_ulong vaddr)
+uint16_t mem_load2(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr)
{
uintptr_t ra = GETPC();
- check_noshuf(env, slot, vaddr, 2);
+ check_noshuf(env, pkt_has_store_s1, slot, vaddr, 2);
return cpu_lduw_data_ra(env, vaddr, ra);
}
-uint32_t mem_load4(CPUHexagonState *env, uint32_t slot, target_ulong vaddr)
+uint32_t mem_load4(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr)
{
uintptr_t ra = GETPC();
- check_noshuf(env, slot, vaddr, 4);
+ check_noshuf(env, pkt_has_store_s1, slot, vaddr, 4);
return cpu_ldl_data_ra(env, vaddr, ra);
}
-uint64_t mem_load8(CPUHexagonState *env, uint32_t slot, target_ulong vaddr)
+uint64_t mem_load8(CPUHexagonState *env, bool pkt_has_store_s1,
+ uint32_t slot, target_ulong vaddr)
{
uintptr_t ra = GETPC();
- check_noshuf(env, slot, vaddr, 8);
+ check_noshuf(env, pkt_has_store_s1, slot, vaddr, 8);
return cpu_ldq_data_ra(env, vaddr, ra);
}
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index b185dda35a..af38c95e26 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -463,7 +463,6 @@ static void mark_implicit_pred_reads(DisasContext *ctx)
static void analyze_packet(DisasContext *ctx)
{
Packet *pkt = ctx->pkt;
- ctx->need_pkt_has_store_s1 = false;
ctx->has_hvx_helper = false;
for (int i = 0; i < pkt->num_insns; i++) {
Insn *insn = &pkt->insn[i];
@@ -519,10 +518,6 @@ static void gen_start_packet(DisasContext *ctx)
analyze_packet(ctx);
- if (ctx->need_pkt_has_store_s1) {
- tcg_gen_movi_tl(hex_pkt_has_store_s1, pkt->pkt_has_store_s1);
- }
-
/*
* pregs_written is used both in the analyze phase as well as the code
* gen phase, so clear it again.
@@ -1204,8 +1199,6 @@ void hexagon_translate_init(void)
offsetof(CPUHexagonState, slot_cancelled), "slot_cancelled");
hex_branch_taken = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, branch_taken), "branch_taken");
- hex_pkt_has_store_s1 = tcg_global_mem_new(cpu_env,
- offsetof(CPUHexagonState, pkt_has_store_s1), "pkt_has_store_s1");
hex_dczero_addr = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, dczero_addr), "dczero_addr");
hex_llsc_addr = tcg_global_mem_new(cpu_env,
diff --git a/target/hexagon/gen_analyze_funcs.py b/target/hexagon/gen_analyze_funcs.py
index 36da669450..d040f67001 100755
--- a/target/hexagon/gen_analyze_funcs.py
+++ b/target/hexagon/gen_analyze_funcs.py
@@ -209,8 +209,6 @@ def gen_analyze_func(f, tag, regs, imms):
has_generated_helper = not hex_common.skip_qemu_helper(
tag
) and not hex_common.is_idef_parser_enabled(tag)
- if has_generated_helper and "A_SCALAR_LOAD" in hex_common.attribdict[tag]:
- f.write(" ctx->need_pkt_has_store_s1 = true;\n")
## Mark HVX instructions with generated helpers
if (has_generated_helper and
diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
index e259ea3d03..39751a483c 100755
--- a/target/hexagon/gen_helper_funcs.py
+++ b/target/hexagon/gen_helper_funcs.py
@@ -303,7 +303,7 @@ def gen_helper_function(f, tag, tagregs, tagimms):
if hex_common.need_slot(tag):
if i > 0:
f.write(", ")
- f.write("uint32_t slot")
+ f.write("uint32_t slotval")
i += 1
if hex_common.need_part1(tag):
if i > 0:
@@ -331,6 +331,11 @@ def gen_helper_function(f, tag, tagregs, tagimms):
else:
print("Bad register parse: ", regtype, regid, toss, numregs)
+ if hex_common.need_slot(tag):
+ if "A_LOAD" in hex_common.attribdict[tag]:
+ f.write(" bool pkt_has_store_s1 = slotval & 0x1;\n")
+ f.write(" uint32_t slot = slotval >> 1;\n")
+
if "A_FPOP" in hex_common.attribdict[tag]:
f.write(" arch_fpop_start(env);\n")
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index 0403547387..887b1cd369 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -556,7 +556,7 @@ def gen_tcg_func(f, tag, regs, imms):
if hex_common.need_part1(tag):
f.write(" TCGv part1 = tcg_constant_tl(insn->part1);\n")
if hex_common.need_slot(tag):
- f.write(" TCGv slot = tcg_constant_tl(insn->slot);\n")
+ f.write(" TCGv slotval = gen_slotval(ctx);\n")
if hex_common.need_PC(tag):
f.write(" TCGv PC = tcg_constant_tl(ctx->pkt->pc);\n")
if hex_common.helper_needs_next_PC(tag):
@@ -606,7 +606,7 @@ def gen_tcg_func(f, tag, regs, imms):
if hex_common.helper_needs_next_PC(tag):
f.write(", next_PC")
if hex_common.need_slot(tag):
- f.write(", slot")
+ f.write(", slotval")
if hex_common.need_part1(tag):
f.write(", part1")
f.write(");\n")
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index 29c0508f66..011cce1a68 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -247,9 +247,10 @@ def is_new_val(regtype, regid, tag):
def need_slot(tag):
if (
- ("A_CONDEXEC" in attribdict[tag] and "A_JUMP" not in attribdict[tag])
- or "A_STORE" in attribdict[tag]
- or "A_LOAD" in attribdict[tag]
+ "A_CVI_SCATTER" not in attribdict[tag]
+ and "A_CVI_GATHER" not in attribdict[tag]
+ and ("A_STORE" in attribdict[tag]
+ or "A_LOAD" in attribdict[tag])
):
return 1
else:
--
2.25.1
* [PATCH v2 21/21] Hexagon (target/hexagon) Move items to DisasContext
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
` (19 preceding siblings ...)
2023-04-27 23:00 ` [PATCH v2 20/21] Hexagon (target/hexagon) Move pkt_has_store_s1 " Taylor Simpson
@ 2023-04-27 23:00 ` Taylor Simpson
20 siblings, 0 replies; 26+ messages in thread
From: Taylor Simpson @ 2023-04-27 23:00 UTC (permalink / raw)
To: qemu-devel
Cc: tsimpson, richard.henderson, philmd, ale, anjo, bcain,
quic_mathbern
The following items in CPUHexagonState are only used for bookkeeping
within the translation of a packet. With recent changes that eliminate
the need to free TCGv variables, it makes more sense for them to be
transient and kept in DisasContext.
The following items are moved:
dczero_addr
branch_taken
this_PC
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/hexagon/cpu.h | 3 ---
target/hexagon/helper.h | 2 +-
target/hexagon/macros.h | 6 +++++-
target/hexagon/translate.h | 5 ++---
target/hexagon/genptr.c | 6 +++---
target/hexagon/op_helper.c | 5 ++---
target/hexagon/translate.c | 23 +++++++----------------
target/hexagon/README | 2 +-
8 files changed, 21 insertions(+), 31 deletions(-)
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 72b7d79279..d3e5be7778 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -78,7 +78,6 @@ typedef struct {
typedef struct CPUArchState {
target_ulong gpr[TOTAL_PER_THREAD_REGS];
target_ulong pred[NUM_PREGS];
- target_ulong branch_taken;
/* For comparing with LLDB on target - see adjust_stack_ptrs function */
target_ulong last_pc_dumped;
@@ -91,11 +90,9 @@ typedef struct CPUArchState {
* Only used when HEX_DEBUG is on, but unconditionally included
* to reduce recompile time when turning HEX_DEBUG on/off.
*/
- target_ulong this_PC;
target_ulong reg_written[TOTAL_PER_THREAD_REGS];
MemLog mem_log_stores[STORES_MAX];
- target_ulong dczero_addr;
float_status fp_status;
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index f3b298beee..fa0ebaf7c8 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -21,7 +21,7 @@
DEF_HELPER_FLAGS_2(raise_exception, TCG_CALL_NO_RETURN, noreturn, env, i32)
DEF_HELPER_1(debug_start_packet, void, env)
DEF_HELPER_FLAGS_3(debug_check_store_width, TCG_CALL_NO_WG, void, env, int, int)
-DEF_HELPER_FLAGS_4(debug_commit_end, TCG_CALL_NO_WG, void, env, int, int, int)
+DEF_HELPER_FLAGS_5(debug_commit_end, TCG_CALL_NO_WG, void, env, i32, int, int, int)
DEF_HELPER_2(commit_store, void, env, int)
DEF_HELPER_3(gather_store, void, env, i32, int)
DEF_HELPER_1(commit_hvx_stores, void, env)
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index f5ebaf7f54..bad27d1aeb 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -648,7 +648,11 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
reg_field_info[FIELD].offset)
#ifdef QEMU_GENERATE
-#define fDCZEROA(REG) tcg_gen_mov_tl(hex_dczero_addr, (REG))
+#define fDCZEROA(REG) \
+ do { \
+ ctx->dczero_addr = tcg_temp_new(); \
+ tcg_gen_mov_tl(ctx->dczero_addr, (REG)); \
+ } while (0)
#endif
#define fBRANCH_SPECULATE_STALL(DOTNEWVAL, JUMP_COND, SPEC_DIR, HINTBITNUM, \
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 9697b4de0e..4dd59c6726 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -71,6 +71,8 @@ typedef struct DisasContext {
TCGv new_value[TOTAL_PER_THREAD_REGS];
TCGv new_pred_value[NUM_PREGS];
TCGv pred_written;
+ TCGv branch_taken;
+ TCGv dczero_addr;
} DisasContext;
static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
@@ -189,16 +191,13 @@ static inline void ctx_log_qreg_read(DisasContext *ctx, int qnum)
extern TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
extern TCGv hex_pred[NUM_PREGS];
-extern TCGv hex_this_PC;
extern TCGv hex_slot_cancelled;
-extern TCGv hex_branch_taken;
extern TCGv hex_new_value_usr;
extern TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
extern TCGv hex_store_addr[STORES_MAX];
extern TCGv hex_store_width[STORES_MAX];
extern TCGv hex_store_val32[STORES_MAX];
extern TCGv_i64 hex_store_val64[STORES_MAX];
-extern TCGv hex_dczero_addr;
extern TCGv hex_llsc_addr;
extern TCGv hex_llsc_val;
extern TCGv_i64 hex_llsc_val_i64;
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 1e98e2913c..bd0e11247a 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -480,9 +480,9 @@ static void gen_write_new_pc_addr(DisasContext *ctx, TCGv addr,
if (ctx->pkt->pkt_has_multi_cof) {
/* If there are multiple branches in a packet, ignore the second one */
tcg_gen_movcond_tl(TCG_COND_NE, hex_gpr[HEX_REG_PC],
- hex_branch_taken, tcg_constant_tl(0),
+ ctx->branch_taken, tcg_constant_tl(0),
hex_gpr[HEX_REG_PC], addr);
- tcg_gen_movi_tl(hex_branch_taken, 1);
+ tcg_gen_movi_tl(ctx->branch_taken, 1);
} else {
tcg_gen_mov_tl(hex_gpr[HEX_REG_PC], addr);
}
@@ -503,7 +503,7 @@ static void gen_write_new_pc_pcrel(DisasContext *ctx, int pc_off,
ctx->branch_cond = TCG_COND_ALWAYS;
if (pred != NULL) {
ctx->branch_cond = cond;
- tcg_gen_mov_tl(hex_branch_taken, pred);
+ tcg_gen_mov_tl(ctx->branch_taken, pred);
}
ctx->branch_dest = dest;
}
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index dfabce3123..12967ac21e 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -203,15 +203,14 @@ static void print_store(CPUHexagonState *env, int slot)
}
/* This function is a handy place to set a breakpoint */
-void HELPER(debug_commit_end)(CPUHexagonState *env,
+void HELPER(debug_commit_end)(CPUHexagonState *env, uint32_t this_PC,
int pred_written, int has_st0, int has_st1)
{
bool reg_printed = false;
bool pred_printed = false;
int i;
- HEX_DEBUG_LOG("Packet committed: pc = 0x" TARGET_FMT_lx "\n",
- env->this_PC);
+ HEX_DEBUG_LOG("Packet committed: pc = 0x" TARGET_FMT_lx "\n", this_PC);
HEX_DEBUG_LOG("slot_cancelled = %d\n", env->slot_cancelled);
for (i = 0; i < TOTAL_PER_THREAD_REGS; i++) {
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index af38c95e26..9099e81431 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -41,17 +41,13 @@ static const AnalyzeInsn opcode_analyze[XX_LAST_OPCODE] = {
TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
TCGv hex_pred[NUM_PREGS];
-TCGv hex_this_PC;
TCGv hex_slot_cancelled;
-TCGv hex_branch_taken;
TCGv hex_new_value_usr;
TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
TCGv hex_store_addr[STORES_MAX];
TCGv hex_store_width[STORES_MAX];
TCGv hex_store_val32[STORES_MAX];
TCGv_i64 hex_store_val64[STORES_MAX];
-TCGv hex_pkt_has_store_s1;
-TCGv hex_dczero_addr;
TCGv hex_llsc_addr;
TCGv hex_llsc_val;
TCGv_i64 hex_llsc_val_i64;
@@ -157,7 +153,7 @@ static void gen_end_tb(DisasContext *ctx)
if (ctx->branch_cond != TCG_COND_NEVER) {
if (ctx->branch_cond != TCG_COND_ALWAYS) {
TCGLabel *skip = gen_new_label();
- tcg_gen_brcondi_tl(ctx->branch_cond, hex_branch_taken, 0, skip);
+ tcg_gen_brcondi_tl(ctx->branch_cond, ctx->branch_taken, 0, skip);
gen_goto_tb(ctx, 0, ctx->branch_dest, true);
gen_set_label(skip);
gen_goto_tb(ctx, 1, ctx->next_PC, false);
@@ -527,16 +523,17 @@ static void gen_start_packet(DisasContext *ctx)
if (HEX_DEBUG) {
/* Handy place to set a breakpoint before the packet executes */
gen_helper_debug_start_packet(cpu_env);
- tcg_gen_movi_tl(hex_this_PC, ctx->base.pc_next);
}
/* Initialize the runtime state for packet semantics */
if (need_slot_cancelled(pkt)) {
tcg_gen_movi_tl(hex_slot_cancelled, 0);
}
+ ctx->branch_taken = NULL;
if (pkt->pkt_has_cof) {
+ ctx->branch_taken = tcg_temp_new();
if (pkt->pkt_has_multi_cof) {
- tcg_gen_movi_tl(hex_branch_taken, 0);
+ tcg_gen_movi_tl(ctx->branch_taken, 0);
}
if (need_next_PC(ctx)) {
tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], next_PC);
@@ -812,7 +809,7 @@ static void process_dczeroa(DisasContext *ctx)
TCGv addr = tcg_temp_new();
TCGv_i64 zero = tcg_constant_i64(0);
- tcg_gen_andi_tl(addr, hex_dczero_addr, ~0x1f);
+ tcg_gen_andi_tl(addr, ctx->dczero_addr, ~0x1f);
tcg_gen_qemu_st64(zero, addr, ctx->mem_idx);
tcg_gen_addi_tl(addr, addr, 8);
tcg_gen_qemu_st64(zero, addr, ctx->mem_idx);
@@ -999,8 +996,8 @@ static void gen_commit_packet(DisasContext *ctx)
tcg_constant_tl(pkt->pkt_has_store_s1 && !pkt->pkt_has_dczeroa);
/* Handy place to set a breakpoint at the end of execution */
- gen_helper_debug_commit_end(cpu_env, ctx->pred_written,
- has_st0, has_st1);
+ gen_helper_debug_commit_end(cpu_env, tcg_constant_tl(ctx->pkt->pc),
+ ctx->pred_written, has_st0, has_st1);
}
if (pkt->vhist_insn != NULL) {
@@ -1193,14 +1190,8 @@ void hexagon_translate_init(void)
offsetof(CPUHexagonState, pred[i]),
hexagon_prednames[i]);
}
- hex_this_PC = tcg_global_mem_new(cpu_env,
- offsetof(CPUHexagonState, this_PC), "this_PC");
hex_slot_cancelled = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, slot_cancelled), "slot_cancelled");
- hex_branch_taken = tcg_global_mem_new(cpu_env,
- offsetof(CPUHexagonState, branch_taken), "branch_taken");
- hex_dczero_addr = tcg_global_mem_new(cpu_env,
- offsetof(CPUHexagonState, dczero_addr), "dczero_addr");
hex_llsc_addr = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, llsc_addr), "llsc_addr");
hex_llsc_val = tcg_global_mem_new(cpu_env,
diff --git a/target/hexagon/README b/target/hexagon/README
index a9a517cfc8..8ecf21d815 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -304,4 +304,4 @@ Here are some handy places to set breakpoints
At the start of execution of a packet for a given PC
br helper_debug_start_packet if env->gpr[41] == 0xdeadbeef
At the end of execution of a packet for a given PC
- br helper_debug_commit_end if env->this_PC == 0xdeadbeef
+ br helper_debug_commit_end if this_PC == 0xdeadbeef
--
2.25.1
* RE: [PATCH v2 11/21] Hexagon (target/hexagon) Short-circuit packet register writes
2023-04-27 23:00 ` [PATCH v2 11/21] Hexagon (target/hexagon) Short-circuit packet register writes Taylor Simpson
@ 2023-04-28 2:47 ` Brian Cain
0 siblings, 0 replies; 26+ messages in thread
From: Brian Cain @ 2023-04-28 2:47 UTC (permalink / raw)
To: Taylor Simpson, qemu-devel@nongnu.org
Cc: Taylor Simpson, richard.henderson@linaro.org, philmd@linaro.org,
ale@rev.ng, anjo@rev.ng, Matheus Bernardino (QUIC)
> -----Original Message-----
> From: Taylor Simpson <tsimpson@quicinc.com>
> Sent: Thursday, April 27, 2023 6:00 PM
> To: qemu-devel@nongnu.org
> Cc: Taylor Simpson <tsimpson@quicinc.com>; richard.henderson@linaro.org;
> philmd@linaro.org; ale@rev.ng; anjo@rev.ng; Brian Cain
> <bcain@quicinc.com>; Matheus Bernardino (QUIC)
> <quic_mathbern@quicinc.com>
> Subject: [PATCH v2 11/21] Hexagon (target/hexagon) Short-circuit packet
> register writes
>
> In certain cases, we can avoid the overhead of writing to hex_new_value
> and write directly to hex_gpr. We add need_commit field to DisasContext
> indicating if the end-of-packet commit is needed. If it is not needed,
> get_result_gpr() and get_result_gpr_pair() can return hex_gpr.
>
> We pass the ctx->need_commit to helpers when needed.
>
> Finally, we can early-exit from gen_reg_writes during packet commit.
>
> There are a few instructions whose semantics write to the result before
> reading all the inputs. Therefore, the idef-parser generated code is
> incompatible with short-circuit. We tell idef-parser to skip them.
>
> For debugging purposes, we add a cpu property to turn off short-circuit.
> When the short-circuit property is false, we skip the analysis and force
> the end-of-packet commit.
>
> Here's a simple example of the TCG generated for
> 0x004000b4: 0x7800c020 { R0 = #0x1 }
>
> BEFORE:
> ---- 004000b4
> movi_i32 new_r0,$0x1
> mov_i32 r0,new_r0
>
> AFTER:
> ---- 004000b4
> movi_i32 r0,$0x1
>
> This patch reintroduces a use of check_for_attrib, so we remove the
> G_GNUC_UNUSED added earlier in this series.
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/hexagon/cpu.h | 1 +
> target/hexagon/gen_tcg.h | 3 +-
> target/hexagon/genptr.h | 2 +
> target/hexagon/helper.h | 2 +-
> target/hexagon/macros.h | 13 ++++-
> target/hexagon/translate.h | 2 +
> target/hexagon/arch.c | 3 +-
> target/hexagon/cpu.c | 5 +-
> target/hexagon/genptr.c | 30 ++++-------
> target/hexagon/op_helper.c | 5 +-
> target/hexagon/translate.c | 67 ++++++++++++++++++++++++-
> target/hexagon/gen_helper_funcs.py | 2 +
> target/hexagon/gen_helper_protos.py | 10 +++-
> target/hexagon/gen_idef_parser_funcs.py | 7 +++
> target/hexagon/gen_tcg_funcs.py | 5 ++
> target/hexagon/hex_common.py | 3 ++
> 16 files changed, 129 insertions(+), 31 deletions(-)
>
> diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
> index 81b663ecfb..9252055a38 100644
> --- a/target/hexagon/cpu.h
> +++ b/target/hexagon/cpu.h
> @@ -146,6 +146,7 @@ struct ArchCPU {
>
> bool lldb_compat;
> target_ulong lldb_stack_adjust;
> + bool short_circuit;
> };
>
> #include "cpu_bits.h"
> diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
> index 2b2a6175a5..1f7e535300 100644
> --- a/target/hexagon/gen_tcg.h
> +++ b/target/hexagon/gen_tcg.h
> @@ -592,7 +592,8 @@
> #define fGEN_TCG_A5_ACS(SHORTCODE) \
> do { \
> gen_helper_vacsh_pred(PeV, cpu_env, RxxV, RssV, RttV); \
> - gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV); \
> + gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV, \
> + tcg_constant_tl(ctx->need_commit)); \
> } while (0)
>
> #define fGEN_TCG_S2_cabacdecbin(SHORTCODE) \
> diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
> index 75d0fc262d..420867f934 100644
> --- a/target/hexagon/genptr.h
> +++ b/target/hexagon/genptr.h
> @@ -58,4 +58,6 @@ void gen_set_half(int N, TCGv result, TCGv src);
> void gen_set_half_i64(int N, TCGv_i64 result, TCGv src);
> void probe_noshuf_load(TCGv va, int s, int mi);
>
> +extern const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS];
> +
> #endif
> diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
> index 73849e3d49..4b750d0351 100644
> --- a/target/hexagon/helper.h
> +++ b/target/hexagon/helper.h
> @@ -29,7 +29,7 @@ DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
> DEF_HELPER_FLAGS_1(fbrev, TCG_CALL_NO_RWG_SE, i32, i32)
> DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
> DEF_HELPER_2(sfinvsqrta, i64, env, f32)
> -DEF_HELPER_4(vacsh_val, s64, env, s64, s64, s64)
> +DEF_HELPER_5(vacsh_val, s64, env, s64, s64, s64, i32)
> DEF_HELPER_FLAGS_4(vacsh_pred, TCG_CALL_NO_RWG_SE, s32, env, s64, s64, s64)
> DEF_HELPER_FLAGS_2(cabacdecbin_val, TCG_CALL_NO_RWG_SE, s64, s64, s64)
> DEF_HELPER_FLAGS_2(cabacdecbin_pred, TCG_CALL_NO_RWG_SE, s32, s64, s64)
> diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
> index 16e72ed0d5..a68446a367 100644
> --- a/target/hexagon/macros.h
> +++ b/target/hexagon/macros.h
> @@ -44,8 +44,17 @@
> reg_field_info[FIELD].offset)
>
> #define SET_USR_FIELD(FIELD, VAL) \
> - fINSERT_BITS(env->new_value[HEX_REG_USR], reg_field_info[FIELD].width, \
> - reg_field_info[FIELD].offset, (VAL))
> + do { \
> + if (pkt_need_commit) { \
> + fINSERT_BITS(env->new_value[HEX_REG_USR], \
> + reg_field_info[FIELD].width, \
> + reg_field_info[FIELD].offset, (VAL)); \
> + } else { \
> + fINSERT_BITS(env->gpr[HEX_REG_USR], \
> + reg_field_info[FIELD].width, \
> + reg_field_info[FIELD].offset, (VAL)); \
> + } \
> + } while (0)
> #endif
>
> #ifdef QEMU_GENERATE
> diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
> index f72228859f..3f6fd3452c 100644
> --- a/target/hexagon/translate.h
> +++ b/target/hexagon/translate.h
> @@ -62,10 +62,12 @@ typedef struct DisasContext {
> int qreg_log_idx;
> DECLARE_BITMAP(qregs_read, NUM_QREGS);
> bool pre_commit;
> + bool need_commit;
> TCGCond branch_cond;
> target_ulong branch_dest;
> bool is_tight_loop;
> bool need_pkt_has_store_s1;
> + bool short_circuit;
> } DisasContext;
>
> static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
> diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
> index da79b41c4d..d053d68487 100644
> --- a/target/hexagon/arch.c
> +++ b/target/hexagon/arch.c
> @@ -1,5 +1,5 @@
> /*
> - * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
> *
> * This program is free software; you can redistribute it and/or modify
> * it under the terms of the GNU General Public License as published by
> @@ -224,6 +224,7 @@ void arch_fpop_start(CPUHexagonState *env)
>
> void arch_fpop_end(CPUHexagonState *env)
> {
> + const bool pkt_need_commit = true;
> int flags = get_float_exception_flags(&env->fp_status);
> if (flags != 0) {
> SOFTFLOAT_TEST_FLAG(float_flag_inexact, FPINPF, FPINPE);
> diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
> index ab40cfc283..4adf90dcfa 100644
> --- a/target/hexagon/cpu.c
> +++ b/target/hexagon/cpu.c
> @@ -1,5 +1,5 @@
> /*
> - * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
> + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
> *
> * This program is free software; you can redistribute it and/or modify
> * it under the terms of the GNU General Public License as published by
> @@ -52,6 +52,8 @@ static Property hexagon_lldb_compat_property =
> static Property hexagon_lldb_stack_adjust_property =
> DEFINE_PROP_UNSIGNED("lldb-stack-adjust", HexagonCPU, lldb_stack_adjust,
> 0, qdev_prop_uint32, target_ulong);
> +static Property hexagon_short_circuit_property =
> + DEFINE_PROP_BOOL("short-circuit", HexagonCPU, short_circuit, true);
>
> const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
> "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> @@ -328,6 +330,7 @@ static void hexagon_cpu_init(Object *obj)
> cpu_set_cpustate_pointers(cpu);
> qdev_property_add_static(DEVICE(obj), &hexagon_lldb_compat_property);
> qdev_property_add_static(DEVICE(obj), &hexagon_lldb_stack_adjust_property);
> + qdev_property_add_static(DEVICE(obj), &hexagon_short_circuit_property);
> }
>
> #include "hw/core/tcg-cpu-ops.h"
> diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
> index aff9ffe37b..5a0f6b5195 100644
> --- a/target/hexagon/genptr.c
> +++ b/target/hexagon/genptr.c
> @@ -45,7 +45,7 @@ TCGv gen_read_preg(TCGv pred, uint8_t num)
>
> #define IMMUTABLE (~0)
>
> -static const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS] = {
> +const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS] = {
> [HEX_REG_USR] = 0xc13000c0,
> [HEX_REG_PC] = IMMUTABLE,
> [HEX_REG_GP] = 0x3f,
> @@ -70,14 +70,18 @@ static inline void gen_masked_reg_write(TCGv new_val, TCGv cur_val,
>
> static TCGv get_result_gpr(DisasContext *ctx, int rnum)
> {
> - return hex_new_value[rnum];
> + if (ctx->need_commit) {
> + return hex_new_value[rnum];
> + } else {
> + return hex_gpr[rnum];
> + }
> }
>
> static TCGv_i64 get_result_gpr_pair(DisasContext *ctx, int rnum)
> {
> TCGv_i64 result = tcg_temp_new_i64();
> - tcg_gen_concat_i32_i64(result, hex_new_value[rnum],
> - hex_new_value[rnum + 1]);
> + tcg_gen_concat_i32_i64(result, get_result_gpr(ctx, rnum),
> + get_result_gpr(ctx, rnum + 1));
> return result;
> }
>
> @@ -86,7 +90,7 @@ void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
> const target_ulong reg_mask = reg_immut_masks[rnum];
>
> gen_masked_reg_write(val, hex_gpr[rnum], reg_mask);
> - tcg_gen_mov_tl(hex_new_value[rnum], val);
> + tcg_gen_mov_tl(get_result_gpr(ctx, rnum), val);
> if (HEX_DEBUG) {
> /* Do this so HELPER(debug_commit_end) will know */
> tcg_gen_movi_tl(hex_reg_written[rnum], 1);
> @@ -95,27 +99,15 @@ void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
>
> static void gen_log_reg_write_pair(DisasContext *ctx, int rnum, TCGv_i64 val)
> {
> - const target_ulong reg_mask_low = reg_immut_masks[rnum];
> - const target_ulong reg_mask_high = reg_immut_masks[rnum + 1];
> TCGv val32 = tcg_temp_new();
>
> /* Low word */
> tcg_gen_extrl_i64_i32(val32, val);
> - gen_masked_reg_write(val32, hex_gpr[rnum], reg_mask_low);
> - tcg_gen_mov_tl(hex_new_value[rnum], val32);
> - if (HEX_DEBUG) {
> - /* Do this so HELPER(debug_commit_end) will know */
> - tcg_gen_movi_tl(hex_reg_written[rnum], 1);
> - }
> + gen_log_reg_write(ctx, rnum, val32);
>
> /* High word */
> tcg_gen_extrh_i64_i32(val32, val);
> - gen_masked_reg_write(val32, hex_gpr[rnum + 1], reg_mask_high);
> - tcg_gen_mov_tl(hex_new_value[rnum + 1], val32);
> - if (HEX_DEBUG) {
> - /* Do this so HELPER(debug_commit_end) will know */
> - tcg_gen_movi_tl(hex_reg_written[rnum + 1], 1);
> - }
> + gen_log_reg_write(ctx, rnum + 1, val32);
> }
>
> void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
> diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
> index 46ccc59106..fc5c30a141 100644
> --- a/target/hexagon/op_helper.c
> +++ b/target/hexagon/op_helper.c
> @@ -220,7 +220,7 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
> reg_printed = true;
> }
> HEX_DEBUG_LOG("\tr%d = " TARGET_FMT_ld " (0x" TARGET_FMT_lx
> ")\n",
> - i, env->new_value[i], env->new_value[i]);
> + i, env->gpr[i], env->gpr[i]);
> }
> }
>
> @@ -352,7 +352,8 @@ uint64_t HELPER(sfinvsqrta)(CPUHexagonState *env, float32 RsV)
> }
>
> int64_t HELPER(vacsh_val)(CPUHexagonState *env,
> - int64_t RxxV, int64_t RssV, int64_t RttV)
> + int64_t RxxV, int64_t RssV, int64_t RttV,
> + uint32_t pkt_need_commit)
> {
> for (int i = 0; i < 4; i++) {
> int xv = sextract64(RxxV, i * 16, 16);
> diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
> index 023fc9be1e..5bd71bdcaf 100644
> --- a/target/hexagon/translate.c
> +++ b/target/hexagon/translate.c
> @@ -27,6 +27,7 @@
> #include "insn.h"
> #include "decode.h"
> #include "translate.h"
> +#include "genptr.h"
> #include "printinsn.h"
>
> #include "analyze_funcs_generated.c.inc"
> @@ -239,7 +240,7 @@ static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
> return nwords;
> }
>
> -static G_GNUC_UNUSED bool check_for_attrib(Packet *pkt, int attrib)
> +static bool check_for_attrib(Packet *pkt, int attrib)
> {
> for (int i = 0; i < pkt->num_insns; i++) {
> if (GET_ATTRIB(pkt->insn[i].opcode, attrib)) {
> @@ -336,6 +337,58 @@ static void mark_implicit_pred_writes(DisasContext *ctx)
> mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P3, 3);
> }
>
> +static bool pkt_raises_exception(Packet *pkt)
> +{
> + if (check_for_attrib(pkt, A_LOAD) ||
> + check_for_attrib(pkt, A_STORE)) {
> + return true;
> + }
> + return false;
> +}
> +
> +static bool need_commit(DisasContext *ctx)
> +{
> + Packet *pkt = ctx->pkt;
> +
> + /*
> + * If the short-circuit property is set to false, we'll always do the commit
> + */
> + if (!ctx->short_circuit) {
> + return true;
> + }
> +
> + if (pkt_raises_exception(pkt)) {
> + return true;
> + }
> +
> + /* Registers with immutability flags require new_value */
> + for (int i = 0; i < ctx->reg_log_idx; i++) {
> + int rnum = ctx->reg_log[i];
> + if (reg_immut_masks[rnum]) {
> + return true;
> + }
> + }
> +
> + /* Floating point instructions are hard-coded to use new_value */
> + if (check_for_attrib(pkt, A_FPOP)) {
> + return true;
> + }
> +
> + if (pkt->num_insns == 1) {
> + return false;
> + }
> +
> + /* Check for overlap between register reads and writes */
> + for (int i = 0; i < ctx->reg_log_idx; i++) {
> + int rnum = ctx->reg_log[i];
> + if (test_bit(rnum, ctx->regs_read)) {
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
> static void mark_implicit_pred_read(DisasContext *ctx, int attrib, int pnum)
> {
> if (GET_ATTRIB(ctx->insn->opcode, attrib)) {
> @@ -365,6 +418,8 @@ static void analyze_packet(DisasContext *ctx)
> mark_implicit_pred_writes(ctx);
> mark_implicit_pred_reads(ctx);
> }
> +
> + ctx->need_commit = need_commit(ctx);
> }
>
> static void gen_start_packet(DisasContext *ctx)
> @@ -434,7 +489,8 @@ static void gen_start_packet(DisasContext *ctx)
> }
>
> /* Preload the predicated registers into hex_new_value[i] */
> - if (!bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
> + if (ctx->need_commit &&
> + !bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
> int i = find_first_bit(ctx->predicated_regs, TOTAL_PER_THREAD_REGS);
> while (i < TOTAL_PER_THREAD_REGS) {
> tcg_gen_mov_tl(hex_new_value[i], hex_gpr[i]);
> @@ -541,6 +597,11 @@ static void gen_reg_writes(DisasContext *ctx)
> {
> int i;
>
> + /* Early exit if not needed */
> + if (!ctx->need_commit) {
> + return;
> + }
> +
> for (i = 0; i < ctx->reg_log_idx; i++) {
> int reg_num = ctx->reg_log[i];
>
> @@ -919,6 +980,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
> CPUState *cs)
> {
> DisasContext *ctx = container_of(dcbase, DisasContext, base);
> + HexagonCPU *hex_cpu = env_archcpu(cs->env_ptr);
> uint32_t hex_flags = dcbase->tb->flags;
>
> ctx->mem_idx = MMU_USER_IDX;
> @@ -927,6 +989,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
> ctx->num_hvx_insns = 0;
> ctx->branch_cond = TCG_COND_NEVER;
> ctx->is_tight_loop = FIELD_EX32(hex_flags, TB_FLAGS, IS_TIGHT_LOOP);
> + ctx->short_circuit = hex_cpu->short_circuit;
> }
>
> static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
> diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
> index c73d792580..e259ea3d03 100755
> --- a/target/hexagon/gen_helper_funcs.py
> +++ b/target/hexagon/gen_helper_funcs.py
> @@ -287,6 +287,8 @@ def gen_helper_function(f, tag, tagregs, tagimms):
>
> if hex_common.need_pkt_has_multi_cof(tag):
> f.write(", uint32_t pkt_has_multi_cof")
> + if (hex_common.need_pkt_need_commit(tag)):
> + f.write(", uint32_t pkt_need_commit")
>
> if hex_common.need_PC(tag):
> if i > 0:
> diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helper_protos.py
> index 187cd6e04e..c5ecb85294 100755
> --- a/target/hexagon/gen_helper_protos.py
> +++ b/target/hexagon/gen_helper_protos.py
> @@ -86,6 +86,8 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
> def_helper_size = len(regs) + len(imms) + numscalarreadwrite + 1
> if hex_common.need_pkt_has_multi_cof(tag):
> def_helper_size += 1
> + if hex_common.need_pkt_need_commit(tag):
> + def_helper_size += 1
> if hex_common.need_part1(tag):
> def_helper_size += 1
> if hex_common.need_slot(tag):
> @@ -103,6 +105,8 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
> def_helper_size = len(regs) + len(imms) + numscalarreadwrite
> if hex_common.need_pkt_has_multi_cof(tag):
> def_helper_size += 1
> + if hex_common.need_pkt_need_commit(tag):
> + def_helper_size += 1
> if hex_common.need_part1(tag):
> def_helper_size += 1
> if hex_common.need_slot(tag):
> @@ -156,10 +160,12 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
> for immlett, bits, immshift in imms:
> f.write(", s32")
>
> - ## Add the arguments for the instruction pkt_has_multi_cof, slot and
> - ## part1 (if needed)
> + ## Add the arguments for the instruction pkt_has_multi_cof,
> + ## pkt_need_commit, PC, next_PC, slot, and part1 (if needed)
> if hex_common.need_pkt_has_multi_cof(tag):
> f.write(", i32")
> + if hex_common.need_pkt_need_commit(tag):
> + f.write(', i32')
> if hex_common.need_PC(tag):
> f.write(", i32")
> if hex_common.helper_needs_next_PC(tag):
> diff --git a/target/hexagon/gen_idef_parser_funcs.py b/target/hexagon/gen_idef_parser_funcs.py
> index afe68bdb6f..b7f2df0f36 100644
> --- a/target/hexagon/gen_idef_parser_funcs.py
> +++ b/target/hexagon/gen_idef_parser_funcs.py
> @@ -109,6 +109,13 @@ def main():
> continue
> if "A_COF" in hex_common.attribdict[tag]:
> continue
> + ## Skip instructions that are incompatible with short-circuit
> + ## packet register writes
> + if ( tag == 'S2_insert' or
> + tag == 'S2_insert_rp' or
> + tag == 'S2_asr_r_svw_trun' or
> + tag == 'A2_swiz' ):
> + continue
>
> regs = tagregs[tag]
> imms = tagimms[tag]
> diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
> index d9ccbe63f6..0e45d43685 100755
> --- a/target/hexagon/gen_tcg_funcs.py
> +++ b/target/hexagon/gen_tcg_funcs.py
> @@ -550,6 +550,9 @@ def gen_tcg_func(f, tag, regs, imms):
> if hex_common.need_pkt_has_multi_cof(tag):
> f.write(" TCGv pkt_has_multi_cof = ")
> f.write("tcg_constant_tl(ctx->pkt->pkt_has_multi_cof);\n")
> + if hex_common.need_pkt_need_commit(tag):
> + f.write(" TCGv pkt_need_commit = ")
> + f.write("tcg_constant_tl(ctx->need_commit);\n")
> if hex_common.need_part1(tag):
> f.write(" TCGv part1 = tcg_constant_tl(insn->part1);\n")
> if hex_common.need_slot(tag):
> @@ -596,6 +599,8 @@ def gen_tcg_func(f, tag, regs, imms):
>
> if hex_common.need_pkt_has_multi_cof(tag):
> f.write(", pkt_has_multi_cof")
> + if hex_common.need_pkt_need_commit(tag):
> + f.write(", pkt_need_commit")
> if hex_common.need_PC(tag):
> f.write(", PC")
> if hex_common.helper_needs_next_PC(tag):
> diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
> index 232c6e2c20..29c0508f66 100755
> --- a/target/hexagon/hex_common.py
> +++ b/target/hexagon/hex_common.py
> @@ -276,6 +276,9 @@ def need_pkt_has_multi_cof(tag):
> return "A_COF" in attribdict[tag]
>
>
> +def need_pkt_need_commit(tag):
> + return 'A_IMPLICIT_WRITES_USR' in attribdict[tag]
> +
> def need_condexec_reg(tag, regs):
> if "A_CONDEXEC" in attribdict[tag]:
> for regtype, regid, toss, numregs in regs:
> --
> 2.25.1
Reviewed-by: Brian Cain <bcain@quicinc.com>
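
For readers following the thread, the order of checks in need_commit() quoted above can be modeled outside QEMU. This is only a sketch: PacketModel, MODEL_NUM_REGS and model_need_commit are hypothetical names invented here, the booleans stand in for the check_for_attrib() calls, and a 64-bit word stands in for the regs_read bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register count for this sketch (not TOTAL_PER_THREAD_REGS). */
#define MODEL_NUM_REGS 64

/* Hypothetical, simplified stand-in for the state need_commit() consults. */
typedef struct {
    bool short_circuit;      /* models ctx->short_circuit */
    bool has_load_or_store;  /* models the A_LOAD / A_STORE attribute checks */
    bool has_fpop;           /* models the A_FPOP attribute check */
    int num_insns;           /* models pkt->num_insns */
    int reg_log[MODEL_NUM_REGS];
    int reg_log_idx;         /* number of valid entries in reg_log */
    uint64_t regs_read;      /* bit n set means register n is read */
} PacketModel;

/* Returns true when writes must go through new_value and be committed. */
static bool model_need_commit(const PacketModel *p,
                              const uint32_t immut_masks[MODEL_NUM_REGS])
{
    if (!p->short_circuit) {
        return true;            /* property disabled: always commit */
    }
    if (p->has_load_or_store) {
        return true;            /* packet may raise an exception */
    }
    for (int i = 0; i < p->reg_log_idx; i++) {
        int rnum = p->reg_log[i];
        assert(rnum >= 0 && rnum < MODEL_NUM_REGS);
        if (immut_masks[rnum]) {
            return true;        /* masked registers need new_value */
        }
    }
    if (p->has_fpop) {
        return true;            /* FP instructions use new_value */
    }
    if (p->num_insns == 1) {
        return false;           /* single instruction: nothing to overlap */
    }
    for (int i = 0; i < p->reg_log_idx; i++) {
        if ((p->regs_read >> p->reg_log[i]) & 1) {
            return true;        /* a written register is also read */
        }
    }
    return false;
}
```

In this model a two-instruction packet that writes r0 and reads only r1 short-circuits, while the same packet reading r0 does not.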
* Re: [PATCH v2 01/21] meson.build Add CONFIG_HEXAGON_IDEF_PARSER
2023-04-27 22:59 ` [PATCH v2 01/21] meson.build Add CONFIG_HEXAGON_IDEF_PARSER Taylor Simpson
@ 2023-04-28 8:01 ` Richard Henderson
0 siblings, 0 replies; 26+ messages in thread
From: Richard Henderson @ 2023-04-28 8:01 UTC (permalink / raw)
To: Taylor Simpson, qemu-devel; +Cc: philmd, ale, anjo, bcain, quic_mathbern
On 4/27/23 23:59, Taylor Simpson wrote:
> Enable conditional compilation depending on whether idef-parser
> is configured
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
> meson.build | 1 +
> 1 file changed, 1 insertion(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* Re: [PATCH v2 05/21] Hexagon (target/hexagon) Add overrides for clr[tf]new
2023-04-27 22:59 ` [PATCH v2 05/21] Hexagon (target/hexagon) Add overrides for clr[tf]new Taylor Simpson
@ 2023-04-28 8:04 ` Richard Henderson
0 siblings, 0 replies; 26+ messages in thread
From: Richard Henderson @ 2023-04-28 8:04 UTC (permalink / raw)
To: Taylor Simpson, qemu-devel; +Cc: philmd, ale, anjo, bcain, quic_mathbern
On 4/27/23 23:59, Taylor Simpson wrote:
> These instructions have implicit reads from p0, so we don't want
> them in helpers when idef-parser is off.
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
> target/hexagon/gen_tcg.h | 16 ++++++++++++++++
> target/hexagon/macros.h | 4 ----
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
> index 7c5cb93297..f3e9c280b0 100644
> --- a/target/hexagon/gen_tcg.h
> +++ b/target/hexagon/gen_tcg.h
> @@ -1097,6 +1097,22 @@
> gen_jump(ctx, riV); \
> } while (0)
>
> +/* if (p0.new) r0 = #0 */
> +#define fGEN_TCG_SA1_clrtnew(SHORTCODE) \
> + do { \
> + tcg_gen_movcond_tl(TCG_COND_EQ, RdV, \
> + hex_new_pred_value[0], tcg_constant_tl(0), \
> + RdV, tcg_constant_tl(0)); \
> + } while (0)
> +
> +/* if (!p0.new) r0 = #0 */
> +#define fGEN_TCG_SA1_clrfnew(SHORTCODE) \
> + do { \
> + tcg_gen_movcond_tl(TCG_COND_NE, RdV, \
> + hex_new_pred_value[0], tcg_constant_tl(0), \
> + RdV, tcg_constant_tl(0)); \
> + } while (0)
> +
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
> #define fGEN_TCG_J2_pause(SHORTCODE) \
> do { \
> uiV = uiV; \
> diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
> index 3e162de3a7..2cb0647ce2 100644
> --- a/target/hexagon/macros.h
> +++ b/target/hexagon/macros.h
> @@ -227,12 +227,8 @@ static inline void gen_cancel(uint32_t slot)
>
> #ifdef QEMU_GENERATE
> #define fLSBNEW(PVAL) tcg_gen_andi_tl(LSB, (PVAL), 1)
> -#define fLSBNEW0 tcg_gen_andi_tl(LSB, hex_new_pred_value[0], 1)
> -#define fLSBNEW1 tcg_gen_andi_tl(LSB, hex_new_pred_value[1], 1)
> #else
> #define fLSBNEW(PVAL) ((PVAL) & 1)
> -#define fLSBNEW0 (env->new_pred_value[0] & 1)
> -#define fLSBNEW1 (env->new_pred_value[1] & 1)
> #endif
>
> #ifdef QEMU_GENERATE
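
As a reading aid for the two overrides quoted above: tcg_gen_movcond_tl(cond, ret, c1, c2, v1, v2) sets ret to v1 when "c1 cond c2" holds and to v2 otherwise. The plain-C sketch below mirrors that selection on ordinary integers. The model_* names are hypothetical and exist only for this illustration; rd stands for the incoming RdV and p0_new for hex_new_pred_value[0].

```c
#include <stdint.h>

/* Plain-C model of movcond with TCG_COND_EQ: (c1 == c2) ? v1 : v2 */
static uint32_t model_movcond_eq(uint32_t c1, uint32_t c2,
                                 uint32_t v1, uint32_t v2)
{
    return c1 == c2 ? v1 : v2;
}

/* Plain-C model of movcond with TCG_COND_NE: (c1 != c2) ? v1 : v2 */
static uint32_t model_movcond_ne(uint32_t c1, uint32_t c2,
                                 uint32_t v1, uint32_t v2)
{
    return c1 != c2 ? v1 : v2;
}

/* if (p0.new) r0 = #0: keep rd when p0_new is zero, otherwise clear it */
static uint32_t model_clrtnew(uint32_t rd, uint32_t p0_new)
{
    return model_movcond_eq(p0_new, 0, rd, 0);
}

/* if (!p0.new) r0 = #0: keep rd when p0_new is nonzero, otherwise clear it */
static uint32_t model_clrfnew(uint32_t rd, uint32_t p0_new)
{
    return model_movcond_ne(p0_new, 0, rd, 0);
}
```

Because the destination is also passed as the "keep" operand, the override needs no branch and no separate temporary.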
* Re: [PATCH v2 14/21] Hexagon (target/hexagon) Short-circuit more HVX single instruction packets
2023-04-27 23:00 ` [PATCH v2 14/21] Hexagon (target/hexagon) Short-circuit more HVX single instruction packets Taylor Simpson
@ 2023-04-28 8:09 ` Richard Henderson
0 siblings, 0 replies; 26+ messages in thread
From: Richard Henderson @ 2023-04-28 8:09 UTC (permalink / raw)
To: Taylor Simpson, qemu-devel; +Cc: philmd, ale, anjo, bcain, quic_mathbern
On 4/28/23 00:00, Taylor Simpson wrote:
> The generated helpers for HVX use pass-by-reference, so they can't
> short-circuit when the reads/writes overlap. The instructions with
> overrides are OK because they use tcg_gen_gvec_*.
>
> We add a flag has_hvx_helper to DisasContext and extend gen_analyze_funcs
> to set the flag when the instruction is an HVX instruction with a
> generated helper.
>
> We add an override for V6_vcombine so that it can be short-circuited
> along with a test case in tests/tcg/hexagon/hvx_misc.c
>
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
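
To illustrate the pass-by-reference hazard the commit message describes, here is a small standalone sketch. It is not an actual HVX helper: model_rotate_helper, MODEL_VEC_LEN and the rotate operation are hypothetical, chosen only because each output element depends on a different input element.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vector length for this sketch. */
#define MODEL_VEC_LEN 4

/* Hypothetical helper taking pointers: dst[i] = src[(i + 1) % N] */
static void model_rotate_helper(uint32_t *dst, const uint32_t *src)
{
    for (size_t i = 0; i < MODEL_VEC_LEN; i++) {
        dst[i] = src[(i + 1) % MODEL_VEC_LEN];
    }
}

/* Commit path: the helper writes a temporary, which is copied back later. */
static void model_rotate_with_commit(uint32_t *reg)
{
    uint32_t tmp[MODEL_VEC_LEN];
    model_rotate_helper(tmp, reg);
    memcpy(reg, tmp, sizeof(tmp));
}

/*
 * Short-circuit path with overlapping read and write: the helper writes
 * straight into the register it is still reading, so the last element
 * picks up an already-overwritten value.
 */
static void model_rotate_short_circuit(uint32_t *reg)
{
    model_rotate_helper(reg, reg);
}
```

Starting from {1, 2, 3, 4}, the commit path yields {2, 3, 4, 1}, while the overlapping short-circuit path yields {2, 3, 4, 2}.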
end of thread, other threads:[~2023-04-28 8:10 UTC | newest]
Thread overview: 26+ messages
2023-04-27 22:59 [PATCH v2 00/21] Hexagon (target/hexagon) short-circuit and move to DisasContext Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 01/21] meson.build Add CONFIG_HEXAGON_IDEF_PARSER Taylor Simpson
2023-04-28 8:01 ` Richard Henderson
2023-04-27 22:59 ` [PATCH v2 02/21] Hexagon (target/hexagon) Add DisasContext arg to gen_log_reg_write Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 03/21] Hexagon (target/hexagon) Add overrides for loop setup instructions Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 04/21] Hexagon (target/hexagon) Add overrides for allocframe/deallocframe Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 05/21] Hexagon (target/hexagon) Add overrides for clr[tf]new Taylor Simpson
2023-04-28 8:04 ` Richard Henderson
2023-04-27 22:59 ` [PATCH v2 06/21] Hexagon (target/hexagon) Remove log_reg_write from op_helper.[ch] Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 07/21] Hexagon (target/hexagon) Eliminate uses of log_pred_write function Taylor Simpson
2023-04-27 22:59 ` [PATCH v2 08/21] Hexagon (target/hexagon) Clean up pred_written usage Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 09/21] Hexagon (target/hexagon) Don't overlap dest writes with source reads Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 10/21] Hexagon (target/hexagon) Mark registers as read during packet analysis Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 11/21] Hexagon (target/hexagon) Short-circuit packet register writes Taylor Simpson
2023-04-28 2:47 ` Brian Cain
2023-04-27 23:00 ` [PATCH v2 12/21] Hexagon (target/hexagon) Short-circuit packet predicate writes Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 13/21] Hexagon (target/hexagon) Short-circuit packet HVX writes Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 14/21] Hexagon (target/hexagon) Short-circuit more HVX single instruction packets Taylor Simpson
2023-04-28 8:09 ` Richard Henderson
2023-04-27 23:00 ` [PATCH v2 15/21] Hexagon (target/hexagon) Add overrides for disabled idef-parser insns Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 16/21] Hexagon (target/hexagon) Make special new_value for USR Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 17/21] Hexagon (target/hexagon) Move new_value to DisasContext Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 18/21] Hexagon (target/hexagon) Move new_pred_value " Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 19/21] Hexagon (target/hexagon) Move pred_written " Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 20/21] Hexagon (target/hexagon) Move pkt_has_store_s1 " Taylor Simpson
2023-04-27 23:00 ` [PATCH v2 21/21] Hexagon (target/hexagon) Move items " Taylor Simpson