* [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and RRR opcode format
@ 2015-01-26 16:29 Bastian Koppelmann
2015-01-26 16:29 ` [Qemu-devel] [PATCH v2 1/4] target-tricore: target-tricore: Add instructions of RR1 opcode format, that have 0x93 as first opcode Bastian Koppelmann
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Bastian Koppelmann @ 2015-01-26 16:29 UTC
To: qemu-devel; +Cc: rth
Hi,
this is a rather short patchset that only implements the instructions of four
opcode formats (RR1, RR2, RRPW and RRR). A separate patchset with a few
bugfixes will follow.
Cheers,
Bastian
v1 -> v2:
- Add 3 helper functions (gen_mul_q, gen_mul_q_16, gen_mulr_q) to
remove repetition.
- gen_mul_q now uses 64-bit arithmetic instead of emulating it.
- MUL_Q now uses an arithmetic shift, instead of a logical shift + sign extend,
for arg extraction.
- optimize OPC2_32_RRPW_EXTR by using only two shifts, instead of four.
- OPC1_32_RRPW_DEXTR now has r1 == r2 as a special case.
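To illustrate the arg extraction change in the list above, here is a minimal
sketch in plain C (not QEMU code) of pulling the signed upper halfword out of
a 32-bit register value. The function names are hypothetical, and the single
shift variant assumes that >> on a negative int32_t is an arithmetic shift,
which GCC and Clang document but ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Portable reference: logical shift, then explicit sign extension. */
static int32_t upper_half_shift_sext(uint32_t r)
{
    int32_t v = (int32_t)(r >> 16);          /* value in 0 .. 0xffff */
    return (v & 0x8000) ? v - 0x10000 : v;   /* sign extend bit 15 */
}

/* One arithmetic shift; assumes the compiler shifts in copies of the
 * sign bit (implementation-defined in ISO C). */
static int32_t upper_half_sar(uint32_t r)
{
    return (int32_t)r >> 16;
}

/* Sanity check that both forms agree for one input. */
static void check_upper_half(uint32_t r)
{
    assert(upper_half_sar(r) == upper_half_shift_sext(r));
}
```

For example, upper_half_sar(0x80000000u) yields -32768 under the stated
assumption, the same value the two-step form produces.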
Bastian Koppelmann (4):
target-tricore: target-tricore: Add instructions of RR1 opcode format,
that have 0x93 as first opcode
target-tricore: Add instructions of RR2 opcode format
target-tricore: Add instructions of RRPW opcode format
target-tricore: Add instructions of RRR opcode format
target-tricore/helper.h | 8 +
target-tricore/op_helper.c | 160 ++++++++++++++
target-tricore/translate.c | 439 +++++++++++++++++++++++++++++++++++++++
target-tricore/tricore-opcodes.h | 2 +-
4 files changed, 608 insertions(+), 1 deletion(-)
--
2.2.2
* [Qemu-devel] [PATCH v2 1/4] target-tricore: target-tricore: Add instructions of RR1 opcode format, that have 0x93 as first opcode
2015-01-26 16:29 [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and RRR opcode format Bastian Koppelmann
@ 2015-01-26 16:29 ` Bastian Koppelmann
2015-01-26 16:29 ` [Qemu-devel] [PATCH v2 2/4] target-tricore: Add instructions of RR2 opcode format Bastian Koppelmann
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Bastian Koppelmann @ 2015-01-26 16:29 UTC
To: qemu-devel; +Cc: rth
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
---
v1 -> v2:
- now uses 3 helper functions (gen_mul_q, gen_mul_q_16, gen_mulr_q) to
remove repetition.
- now uses 64-bit arithmetic instead of emulating it.
- now uses arithmetic shift, instead of normal shift + sign extend for arg
extraction.
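As a rough illustration of the 64-bit arithmetic mentioned above, the sketch
below models in plain C what the 32x32 bit case with a 32-bit result appears
to compute: the upper word of the 64-bit product, optionally shifted left by
n first. The name mul_q32_upper is hypothetical and this is not the TCG code
itself, only a reference model under that reading of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model: upper 32 bits of (a * b) << n, n in {0, 1}. */
static uint32_t mul_q32_upper(int32_t a, int32_t b, unsigned n)
{
    assert(n <= 1);
    int64_t prod = (int64_t)a * (int64_t)b;   /* cannot overflow 64 bits */
    uint64_t shifted = (uint64_t)prod << n;   /* shift as unsigned, no UB */
    return (uint32_t)(shifted >> 32);
}
```

With n == 1, two Q31 operands of 0.5 (0x40000000) give 0x20000000, i.e. 0.25.
The only input pair that produces 0x80000000 with n == 1 from two negative
operands is a = b = INT32_MIN, which is the case the patch flags in PSW_V.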
target-tricore/translate.c | 182 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 182 insertions(+)
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index def7f4a..804d181 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -987,6 +987,119 @@ static inline void gen_maddsui_32(TCGv ret, TCGv r1, TCGv r2, int32_t con)
tcg_temp_free(temp);
}
+static void
+gen_mul_q(TCGv rl, TCGv rh, TCGv arg1, TCGv arg2, uint32_t n, uint32_t up_shift)
+{
+ TCGv temp = tcg_temp_new();
+ TCGv_i64 temp_64 = tcg_temp_new_i64();
+ TCGv_i64 temp2_64 = tcg_temp_new_i64();
+
+ if (n == 0) {
+ if (up_shift == 32) {
+ tcg_gen_muls2_tl(rh, rl, arg1, arg2);
+ } else if (up_shift == 16) {
+ tcg_gen_ext_i32_i64(temp_64, arg1);
+ tcg_gen_ext_i32_i64(temp2_64, arg2);
+
+ tcg_gen_mul_i64(temp_64, temp_64, temp2_64);
+ tcg_gen_shri_i64(temp_64, temp_64, up_shift);
+ tcg_gen_extr_i64_i32(rl, rh, temp_64);
+ } else {
+ tcg_gen_muls2_tl(rl, rh, arg1, arg2);
+ }
+ /* reset v bit */
+ tcg_gen_movi_tl(cpu_PSW_V, 0);
+ } else { /* n is expected to be 1 */
+ tcg_gen_ext_i32_i64(temp_64, arg1);
+ tcg_gen_ext_i32_i64(temp2_64, arg2);
+
+ tcg_gen_mul_i64(temp_64, temp_64, temp2_64);
+
+ if (up_shift == 0) {
+ tcg_gen_shli_i64(temp_64, temp_64, 1);
+ } else {
+ tcg_gen_shri_i64(temp_64, temp_64, up_shift - 1);
+ }
+ tcg_gen_extr_i64_i32(rl, rh, temp_64);
+ /* overflow only occurs if r1 = r2 = 0x8000 */
+ if (up_shift == 0) {/* result is 64 bit */
+ tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_PSW_V, rh,
+ 0x80000000);
+ } else { /* result is 32 bit */
+ tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_PSW_V, rl,
+ 0x80000000);
+ }
+ tcg_gen_shli_tl(cpu_PSW_V, cpu_PSW_V, 31);
+ /* calc sv overflow bit */
+ tcg_gen_or_tl(cpu_PSW_SV, cpu_PSW_SV, cpu_PSW_V);
+ }
+ /* calc av overflow bit */
+ if (up_shift == 0) {
+ tcg_gen_add_tl(cpu_PSW_AV, rh, rh);
+ tcg_gen_xor_tl(cpu_PSW_AV, rh, cpu_PSW_AV);
+ } else {
+ tcg_gen_add_tl(cpu_PSW_AV, rl, rl);
+ tcg_gen_xor_tl(cpu_PSW_AV, rl, cpu_PSW_AV);
+ }
+ /* calc sav overflow bit */
+ tcg_gen_or_tl(cpu_PSW_SAV, cpu_PSW_SAV, cpu_PSW_AV);
+ tcg_temp_free(temp);
+ tcg_temp_free_i64(temp_64);
+ tcg_temp_free_i64(temp2_64);
+}
+
+static void
+gen_mul_q_16(TCGv ret, TCGv arg1, TCGv arg2, uint32_t n)
+{
+ TCGv temp = tcg_temp_new();
+ if (n == 0) {
+ tcg_gen_mul_tl(ret, arg1, arg2);
+ } else { /* n is expected to be 1 */
+ tcg_gen_mul_tl(ret, arg1, arg2);
+ tcg_gen_shli_tl(ret, ret, 1);
+ /* catch special case r1 = r2 = 0x8000 */
+ tcg_gen_setcondi_tl(TCG_COND_EQ, temp, ret, 0x80000000);
+ tcg_gen_sub_tl(ret, ret, temp);
+ }
+ /* reset v bit */
+ tcg_gen_movi_tl(cpu_PSW_V, 0);
+ /* calc av overflow bit */
+ tcg_gen_add_tl(cpu_PSW_AV, ret, ret);
+ tcg_gen_xor_tl(cpu_PSW_AV, ret, cpu_PSW_AV);
+ /* calc sav overflow bit */
+ tcg_gen_or_tl(cpu_PSW_SAV, cpu_PSW_SAV, cpu_PSW_AV);
+
+ tcg_temp_free(temp);
+}
+
+static void gen_mulr_q(TCGv ret, TCGv arg1, TCGv arg2, uint32_t n)
+{
+ TCGv temp = tcg_temp_new();
+ if (n == 0) {
+ tcg_gen_mul_tl(ret, arg1, arg2);
+ tcg_gen_addi_tl(ret, ret, 0x8000);
+ } else {
+ tcg_gen_mul_tl(ret, arg1, arg2);
+ tcg_gen_shli_tl(ret, ret, 1);
+ tcg_gen_addi_tl(ret, ret, 0x8000);
+ /* catch special case r1 = r2 = 0x8000 */
+ tcg_gen_setcondi_tl(TCG_COND_EQ, temp, ret, 0x80008000);
+ tcg_gen_muli_tl(temp, temp, 0x8001);
+ tcg_gen_sub_tl(ret, ret, temp);
+ }
+ /* reset v bit */
+ tcg_gen_movi_tl(cpu_PSW_V, 0);
+ /* calc av overflow bit */
+ tcg_gen_add_tl(cpu_PSW_AV, ret, ret);
+ tcg_gen_xor_tl(cpu_PSW_AV, ret, cpu_PSW_AV);
+ /* calc sav overflow bit */
+ tcg_gen_or_tl(cpu_PSW_SAV, cpu_PSW_SAV, cpu_PSW_AV);
+ /* cut halfword off */
+ tcg_gen_andi_tl(ret, ret, 0xffff0000);
+
+ tcg_temp_free(temp);
+}
+
static inline void
gen_maddsi_64(TCGv ret_low, TCGv ret_high, TCGv r1, TCGv r2_low, TCGv r2_high,
int32_t con)
@@ -4778,6 +4891,72 @@ static void decode_rr1_mul(CPUTriCoreState *env, DisasContext *ctx)
tcg_temp_free(n);
}
+static void decode_rr1_mulq(CPUTriCoreState *env, DisasContext *ctx)
+{
+ uint32_t op2;
+ int r1, r2, r3;
+ uint32_t n;
+
+ TCGv temp, temp2;
+
+ r1 = MASK_OP_RR1_S1(ctx->opcode);
+ r2 = MASK_OP_RR1_S2(ctx->opcode);
+ r3 = MASK_OP_RR1_D(ctx->opcode);
+ n = MASK_OP_RR1_N(ctx->opcode);
+ op2 = MASK_OP_RR1_OP2(ctx->opcode);
+
+ temp = tcg_temp_new();
+ temp2 = tcg_temp_new();
+
+ switch (op2) {
+ case OPC2_32_RR1_MUL_Q_32:
+ gen_mul_q(cpu_gpr_d[r3], temp, cpu_gpr_d[r1], cpu_gpr_d[r2], n, 32);
+ break;
+ case OPC2_32_RR1_MUL_Q_64:
+ gen_mul_q(cpu_gpr_d[r3], cpu_gpr_d[r3+1], cpu_gpr_d[r1], cpu_gpr_d[r2],
+ n, 0);
+ break;
+ case OPC2_32_RR1_MUL_Q_32_L:
+ tcg_gen_ext16s_tl(temp, cpu_gpr_d[r2]);
+ gen_mul_q(cpu_gpr_d[r3], temp, cpu_gpr_d[r1], temp, n, 16);
+ break;
+ case OPC2_32_RR1_MUL_Q_64_L:
+ tcg_gen_ext16s_tl(temp, cpu_gpr_d[r2]);
+ gen_mul_q(cpu_gpr_d[r3], cpu_gpr_d[r3+1], cpu_gpr_d[r1], temp, n, 0);
+ break;
+ case OPC2_32_RR1_MUL_Q_32_U:
+ tcg_gen_sari_tl(temp, cpu_gpr_d[r2], 16);
+ gen_mul_q(cpu_gpr_d[r3], temp, cpu_gpr_d[r1], temp, n, 16);
+ break;
+ case OPC2_32_RR1_MUL_Q_64_U:
+ tcg_gen_sari_tl(temp, cpu_gpr_d[r2], 16);
+ gen_mul_q(cpu_gpr_d[r3], cpu_gpr_d[r3+1], cpu_gpr_d[r1], temp, n, 0);
+ break;
+ case OPC2_32_RR1_MUL_Q_32_LL:
+ tcg_gen_ext16s_tl(temp, cpu_gpr_d[r1]);
+ tcg_gen_ext16s_tl(temp2, cpu_gpr_d[r2]);
+ gen_mul_q_16(cpu_gpr_d[r3], temp, temp2, n);
+ break;
+ case OPC2_32_RR1_MUL_Q_32_UU:
+ tcg_gen_sari_tl(temp, cpu_gpr_d[r1], 16);
+ tcg_gen_sari_tl(temp2, cpu_gpr_d[r2], 16);
+ gen_mul_q_16(cpu_gpr_d[r3], temp, temp2, n);
+ break;
+ case OPC2_32_RR1_MULR_Q_32_L:
+ tcg_gen_ext16s_tl(temp, cpu_gpr_d[r1]);
+ tcg_gen_ext16s_tl(temp2, cpu_gpr_d[r2]);
+ gen_mulr_q(cpu_gpr_d[r3], temp, temp2, n);
+ break;
+ case OPC2_32_RR1_MULR_Q_32_U:
+ tcg_gen_sari_tl(temp, cpu_gpr_d[r1], 16);
+ tcg_gen_sari_tl(temp2, cpu_gpr_d[r2], 16);
+ gen_mulr_q(cpu_gpr_d[r3], temp, temp2, n);
+ break;
+ }
+ tcg_temp_free(temp);
+ tcg_temp_free(temp2);
+}
+
static void decode_32Bit_opc(CPUTriCoreState *env, DisasContext *ctx)
{
int op1;
@@ -5035,6 +5214,9 @@ static void decode_32Bit_opc(CPUTriCoreState *env, DisasContext *ctx)
case OPCM_32_RR1_MUL:
decode_rr1_mul(env, ctx);
break;
+ case OPCM_32_RR1_MULQ:
+ decode_rr1_mulq(env, ctx);
+ break;
}
}
--
2.2.2
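For readers following gen_mul_q_16 in the patch above, here is a small
reference model in plain C of the same idea: multiply two signed 16-bit
fractions, shift left by one when n == 1, and pull the single overflowing
case (0x8000 * 0x8000) back to 0x7fffffff. It also models the advance
overflow value as bit 31 XOR bit 30 of the result. Both function names are
hypothetical and the code is a sketch, not the QEMU implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model of the 16x16 Q-format multiply. */
static uint32_t mul_q_16_model(int16_t a, int16_t b, unsigned n)
{
    assert(n <= 1);
    /* |a * b| <= 2^30, so the 32-bit signed product cannot overflow */
    uint32_t ret = (uint32_t)((int32_t)a * (int32_t)b);
    if (n == 1) {
        ret <<= 1;
        /* only 0x8000 * 0x8000 reaches 0x80000000; subtract 1 there */
        ret -= (ret == 0x80000000u);
    }
    return ret;
}

/* Advance overflow: set when bit 31 and bit 30 of the result differ. */
static unsigned av_bit(uint32_t ret)
{
    return (ret ^ (ret << 1)) >> 31;
}
```

For instance, mul_q_16_model(-32768, -32768, 1) returns 0x7fffffff, and
av_bit() of that value is 1 because bits 31 and 30 differ.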
* [Qemu-devel] [PATCH v2 2/4] target-tricore: Add instructions of RR2 opcode format
2015-01-26 16:29 [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and RRR opcode format Bastian Koppelmann
2015-01-26 16:29 ` [Qemu-devel] [PATCH v2 1/4] target-tricore: target-tricore: Add instructions of RR1 opcode format, that have 0x93 as first opcode Bastian Koppelmann
@ 2015-01-26 16:29 ` Bastian Koppelmann
2015-01-26 16:29 ` [Qemu-devel] [PATCH v2 3/4] target-tricore: Add instructions of RRPW " Bastian Koppelmann
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Bastian Koppelmann @ 2015-01-26 16:29 UTC
To: qemu-devel; +Cc: rth
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
---
target-tricore/translate.c | 37 +++++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 804d181..8f9679e 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -4957,6 +4957,39 @@ static void decode_rr1_mulq(CPUTriCoreState *env, DisasContext *ctx)
tcg_temp_free(temp2);
}
+/* RR2 format */
+static void decode_rr2_mul(CPUTriCoreState *env, DisasContext *ctx)
+{
+ uint32_t op2;
+ int r1, r2, r3;
+
+ op2 = MASK_OP_RR2_OP2(ctx->opcode);
+ r1 = MASK_OP_RR2_S1(ctx->opcode);
+ r2 = MASK_OP_RR2_S2(ctx->opcode);
+ r3 = MASK_OP_RR2_D(ctx->opcode);
+ switch (op2) {
+ case OPC2_32_RR2_MUL_32:
+ gen_mul_i32s(cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RR2_MUL_64:
+ gen_mul_i64s(cpu_gpr_d[r3], cpu_gpr_d[r3+1], cpu_gpr_d[r1],
+ cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RR2_MULS_32:
+ gen_helper_mul_ssov(cpu_gpr_d[r3], cpu_env, cpu_gpr_d[r1],
+ cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RR2_MUL_U_64:
+ gen_mul_i64u(cpu_gpr_d[r3], cpu_gpr_d[r3+1], cpu_gpr_d[r1],
+ cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RR2_MULS_U_32:
+ gen_helper_mul_suov(cpu_gpr_d[r3], cpu_env, cpu_gpr_d[r1],
+ cpu_gpr_d[r2]);
+ break;
+ }
+}
+
static void decode_32Bit_opc(CPUTriCoreState *env, DisasContext *ctx)
{
int op1;
@@ -5217,6 +5250,10 @@ static void decode_32Bit_opc(CPUTriCoreState *env, DisasContext *ctx)
case OPCM_32_RR1_MULQ:
decode_rr1_mulq(env, ctx);
break;
+/* RR2 format */
+ case OPCM_32_RR2_MUL:
+ decode_rr2_mul(env, ctx);
+ break;
}
}
--
2.2.2
* [Qemu-devel] [PATCH v2 3/4] target-tricore: Add instructions of RRPW opcode format
2015-01-26 16:29 [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and RRR opcode format Bastian Koppelmann
2015-01-26 16:29 ` [Qemu-devel] [PATCH v2 1/4] target-tricore: target-tricore: Add instructions of RR1 opcode format, that have 0x93 as first opcode Bastian Koppelmann
2015-01-26 16:29 ` [Qemu-devel] [PATCH v2 2/4] target-tricore: Add instructions of RR2 opcode format Bastian Koppelmann
@ 2015-01-26 16:29 ` Bastian Koppelmann
2015-01-26 16:30 ` [Qemu-devel] [PATCH v2 4/4] target-tricore: Add instructions of RRR " Bastian Koppelmann
2015-01-26 18:11 ` [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and " Richard Henderson
4 siblings, 0 replies; 6+ messages in thread
From: Bastian Koppelmann @ 2015-01-26 16:29 UTC
To: qemu-devel; +Cc: rth
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
---
v1 -> v2:
- optimize OPC2_32_RRPW_EXTR by using only two shifts, instead of four.
- OPC1_32_RRPW_DEXTR now has r1 == r2 as a special case.
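The two-shift form of the signed extract mentioned above can be sketched in
plain C as follows: shift the field up so its top bit lands in bit 31, then
shift back down arithmetically so the field is sign-extended. The name
extr_model is hypothetical. The sketch excludes width == 0 by assertion and
assumes >> on a negative int32_t is arithmetic (true for GCC and Clang,
implementation-defined in ISO C).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: sign-extended bit field src[pos + width - 1 : pos]. */
static int32_t extr_model(uint32_t src, unsigned pos, unsigned width)
{
    /* keeps both shift counts in the range 0..31 */
    assert(width >= 1 && pos + width <= 32);
    uint32_t left = src << (32 - pos - width);   /* field's MSB to bit 31 */
    return (int32_t)left >> (32 - width);        /* arithmetic shift down */
}
```

As an example, extr_model(0x00000f00, 8, 4) gives -1 (a 4-bit field of all
ones), while extr_model(0x00000700, 8, 4) gives 7.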
target-tricore/translate.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 8f9679e..aa70f61 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -4990,6 +4990,57 @@ static void decode_rr2_mul(CPUTriCoreState *env, DisasContext *ctx)
}
}
+/* RRPW format */
+static void decode_rrpw_extract_insert(CPUTriCoreState *env, DisasContext *ctx)
+{
+ uint32_t op2;
+ int r1, r2, r3;
+ int32_t pos, width;
+
+ op2 = MASK_OP_RRPW_OP2(ctx->opcode);
+ r1 = MASK_OP_RRPW_S1(ctx->opcode);
+ r2 = MASK_OP_RRPW_S2(ctx->opcode);
+ r3 = MASK_OP_RRPW_D(ctx->opcode);
+ pos = MASK_OP_RRPW_POS(ctx->opcode);
+ width = MASK_OP_RRPW_WIDTH(ctx->opcode);
+
+ switch (op2) {
+ case OPC2_32_RRPW_EXTR:
+ if (pos + width <= 31) {
+ /* optimize special cases */
+ if ((pos == 0) && (width == 8)) {
+ tcg_gen_ext8s_tl(cpu_gpr_d[r3], cpu_gpr_d[r1]);
+ } else if ((pos == 0) && (width == 16)) {
+ tcg_gen_ext16s_tl(cpu_gpr_d[r3], cpu_gpr_d[r1]);
+ } else {
+ tcg_gen_shli_tl(cpu_gpr_d[r3], cpu_gpr_d[r1], 32 - pos - width);
+ tcg_gen_sari_tl(cpu_gpr_d[r3], cpu_gpr_d[r3], 32 - width);
+ }
+ }
+ break;
+ case OPC2_32_RRPW_EXTR_U:
+ if (width == 0) {
+ tcg_gen_movi_tl(cpu_gpr_d[r3], 0);
+ } else {
+ tcg_gen_shri_tl(cpu_gpr_d[r3], cpu_gpr_d[r1], pos);
+ tcg_gen_andi_tl(cpu_gpr_d[r3], cpu_gpr_d[r3], ~0u >> (32-width));
+ }
+ break;
+ case OPC2_32_RRPW_IMASK:
+ if (pos + width <= 31) {
+ tcg_gen_movi_tl(cpu_gpr_d[r3+1], ((1u << width) - 1) << pos);
+ tcg_gen_shli_tl(cpu_gpr_d[r3], cpu_gpr_d[r2], pos);
+ }
+ break;
+ case OPC2_32_RRPW_INSERT:
+ if (pos + width <= 31) {
+ tcg_gen_deposit_tl(cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2],
+ width, pos);
+ }
+ break;
+ }
+}
+
static void decode_32Bit_opc(CPUTriCoreState *env, DisasContext *ctx)
{
int op1;
@@ -5254,6 +5305,25 @@ static void decode_32Bit_opc(CPUTriCoreState *env, DisasContext *ctx)
case OPCM_32_RR2_MUL:
decode_rr2_mul(env, ctx);
break;
+/* RRPW format */
+ case OPCM_32_RRPW_EXTRACT_INSERT:
+ decode_rrpw_extract_insert(env, ctx);
+ break;
+ case OPC1_32_RRPW_DEXTR:
+ r1 = MASK_OP_RRPW_S1(ctx->opcode);
+ r2 = MASK_OP_RRPW_S2(ctx->opcode);
+ r3 = MASK_OP_RRPW_D(ctx->opcode);
+ const16 = MASK_OP_RRPW_POS(ctx->opcode);
+ if (r1 == r2) {
+ tcg_gen_rotli_tl(cpu_gpr_d[r3], cpu_gpr_d[r1], const16);
+ } else {
+ temp = tcg_temp_new();
+ tcg_gen_shli_tl(cpu_gpr_d[r3], cpu_gpr_d[r2], const16);
+ tcg_gen_shri_tl(temp, cpu_gpr_d[r1], 32 - const16);
+ tcg_gen_or_tl(cpu_gpr_d[r3], cpu_gpr_d[r3], temp);
+ tcg_temp_free(temp);
+ }
+ break;
}
}
--
2.2.2
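The r1 == r2 special case for DEXTR in the patch above relies on a general
identity: a funnel shift whose two inputs are the same word is a rotate. The
plain C sketch below shows only that identity. The names funnel_shift,
rotl32, hi and lo are hypothetical, and which source register plays the hi or
lo role is deliberately left open here.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits of the 64-bit value {hi, lo} shifted left by pos. */
static uint32_t funnel_shift(uint32_t hi, uint32_t lo, unsigned pos)
{
    assert(pos < 32);
    if (pos == 0) {
        return hi;               /* avoid an undefined shift by 32 */
    }
    return (hi << pos) | (lo >> (32 - pos));
}

/* Branch-free rotate left; the mask keeps the right shift count below 32. */
static uint32_t rotl32(uint32_t x, unsigned pos)
{
    assert(pos < 32);
    return (x << pos) | (x >> ((32 - pos) & 31));
}
```

For any x and any pos below 32, funnel_shift(x, x, pos) equals
rotl32(x, pos), which is why a single rotate can replace the shift, shift,
or sequence when both sources are the same register.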
* [Qemu-devel] [PATCH v2 4/4] target-tricore: Add instructions of RRR opcode format
2015-01-26 16:29 [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and RRR opcode format Bastian Koppelmann
` (2 preceding siblings ...)
2015-01-26 16:29 ` [Qemu-devel] [PATCH v2 3/4] target-tricore: Add instructions of RRPW " Bastian Koppelmann
@ 2015-01-26 16:30 ` Bastian Koppelmann
2015-01-26 18:11 ` [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and " Richard Henderson
4 siblings, 0 replies; 6+ messages in thread
From: Bastian Koppelmann @ 2015-01-26 16:30 UTC
To: qemu-devel; +Cc: rth
Add microcode generator function gen_cond_sub.
Add helper functions:
* ixmax/ixmin: search for the max/min value and its related index in a
vector of 16-bit values.
* pack: pack two data registers into an IEEE-754 single-precision
floating-point number.
* dvadj: divide-adjust the result after dvstep instructions.
* dvstep: divide a register by a divisor, producing 8 bits of quotient at a time.
Rename OPCM_32_RRR_FLOAT to OPCM_32_RRR_DIVIDE.
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
target-tricore/helper.h | 8 ++
target-tricore/op_helper.c | 160 +++++++++++++++++++++++++++++++++++++++
target-tricore/translate.c | 150 ++++++++++++++++++++++++++++++++++++
target-tricore/tricore-opcodes.h | 2 +-
4 files changed, 319 insertions(+), 1 deletion(-)
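The signed overflow test used by gen_cond_sub (named in the commit message
above) can be checked in isolation. The plain C sketch below computes the
same expression, ((result ^ a) & (a ^ b)) with bit 31 as the flag, and
cross-checks it against a widened subtraction. The name sub_overflows is
hypothetical, and the cast from uint32_t to int32_t assumes the usual
two's complement wraparound that GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: 1 if the signed 32-bit subtraction a - b overflows. */
static int sub_overflows(uint32_t a, uint32_t b)
{
    uint32_t result = a - b;
    /* overflow iff the operands differ in sign and the result's sign
     * differs from the sign of a */
    int v = (int)(((result ^ a) & (a ^ b)) >> 31);

    /* cross-check against a subtraction carried out in 64 bits */
    int64_t wide = (int64_t)(int32_t)a - (int64_t)(int32_t)b;
    assert(v == (wide > INT32_MAX || wide < INT32_MIN));
    return v;
}
```

For example, sub_overflows(0x80000000, 1) is 1 (INT32_MIN - 1 does not fit),
while sub_overflows(5, 3) is 0.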
diff --git a/target-tricore/helper.h b/target-tricore/helper.h
index 068dc7b..7405fee 100644
--- a/target-tricore/helper.h
+++ b/target-tricore/helper.h
@@ -60,10 +60,14 @@ DEF_HELPER_FLAGS_2(max_b, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(max_bu, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(max_h, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(max_hu, TCG_CALL_NO_RWG_SE, i32, i32, i32)
+DEF_HELPER_FLAGS_2(ixmax, TCG_CALL_NO_RWG_SE, i64, i64, i32)
+DEF_HELPER_FLAGS_2(ixmax_u, TCG_CALL_NO_RWG_SE, i64, i64, i32)
DEF_HELPER_FLAGS_2(min_b, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(min_bu, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(min_h, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(min_hu, TCG_CALL_NO_RWG_SE, i32, i32, i32)
+DEF_HELPER_FLAGS_2(ixmin, TCG_CALL_NO_RWG_SE, i64, i64, i32)
+DEF_HELPER_FLAGS_2(ixmin_u, TCG_CALL_NO_RWG_SE, i64, i64, i32)
/* count leading ... */
DEF_HELPER_FLAGS_1(clo, TCG_CALL_NO_RWG_SE, i32, i32)
DEF_HELPER_FLAGS_1(clo_h, TCG_CALL_NO_RWG_SE, i32, i32)
@@ -81,12 +85,16 @@ DEF_HELPER_FLAGS_2(bmerge, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_1(bsplit, TCG_CALL_NO_RWG_SE, i64, i32)
DEF_HELPER_FLAGS_1(parity, TCG_CALL_NO_RWG_SE, i32, i32)
/* float */
+DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32, i32)
DEF_HELPER_1(unpack, i64, i32)
/* dvinit */
DEF_HELPER_3(dvinit_b_13, i64, env, i32, i32)
DEF_HELPER_3(dvinit_b_131, i64, env, i32, i32)
DEF_HELPER_3(dvinit_h_13, i64, env, i32, i32)
DEF_HELPER_3(dvinit_h_131, i64, env, i32, i32)
+DEF_HELPER_FLAGS_2(dvadj, TCG_CALL_NO_RWG_SE, i64, i64, i32)
+DEF_HELPER_FLAGS_2(dvstep, TCG_CALL_NO_RWG_SE, i64, i64, i32)
+DEF_HELPER_FLAGS_2(dvstep_u, TCG_CALL_NO_RWG_SE, i64, i64, i32)
/* mulh */
DEF_HELPER_FLAGS_5(mul_h, TCG_CALL_NO_RWG_SE, i64, i32, i32, i32, i32, i32)
DEF_HELPER_FLAGS_5(mulm_h, TCG_CALL_NO_RWG_SE, i64, i32, i32, i32, i32, i32)
diff --git a/target-tricore/op_helper.c b/target-tricore/op_helper.c
index 13e2729..7047b7c 100644
--- a/target-tricore/op_helper.c
+++ b/target-tricore/op_helper.c
@@ -867,6 +867,50 @@ uint32_t helper_##name ##_hu(target_ulong r1, target_ulong r2)\
\
return ret; \
} \
+ \
+uint64_t helper_ix##name(uint64_t r1, uint32_t r2) \
+{ \
+ int64_t r2l, r2h, r1hl; \
+ uint64_t ret = 0; \
+ \
+ ret = ((r1 + 2) & 0xffff); \
+ r2l = sextract64(r2, 0, 16); \
+ r2h = sextract64(r2, 16, 16); \
+ r1hl = sextract64(r1, 32, 16); \
+ \
+ if ((r2l op ## = r2h) && (r2l op r1hl)) { \
+ ret |= (r2l & 0xffff) << 32; \
+ ret |= extract64(r1, 0, 16) << 16; \
+ } else if ((r2h op r2l) && (r2h op r1hl)) { \
+ ret |= extract64(r2, 16, 16) << 32; \
+ ret |= extract64(r1 + 1, 0, 16) << 16; \
+ } else { \
+ ret |= r1 & 0xffffffff0000ull; \
+ } \
+ return ret; \
+} \
+ \
+uint64_t helper_ix##name ##_u(uint64_t r1, uint32_t r2) \
+{ \
+ int64_t r2l, r2h, r1hl; \
+ uint64_t ret = 0; \
+ \
+ ret = ((r1 + 2) & 0xffff); \
+ r2l = extract64(r2, 0, 16); \
+ r2h = extract64(r2, 16, 16); \
+ r1hl = extract64(r1, 32, 16); \
+ \
+ if ((r2l op ## = r2h) && (r2l op r1hl)) { \
+ ret |= (r2l & 0xffff) << 32; \
+ ret |= extract64(r1, 0, 16) << 16; \
+ } else if ((r2h op r2l) && (r2h op r1hl)) { \
+ ret |= extract64(r2, 16, 16) << 32; \
+ ret |= extract64(r1 + 1, 0, 16) << 16; \
+ } else { \
+ ret |= r1 & 0xffffffff0000ull; \
+ } \
+ return ret; \
+}
EXTREMA_H_B(max, >)
EXTREMA_H_B(min, <)
@@ -1100,6 +1144,48 @@ uint32_t helper_parity(target_ulong r1)
return ret;
}
+uint32_t helper_pack(uint32_t carry, uint32_t r1_low, uint32_t r1_high,
+ target_ulong r2)
+{
+ uint32_t ret;
+ int32_t fp_exp, fp_frac, temp_exp, fp_exp_frac;
+ int32_t int_exp = r1_high;
+ int32_t int_mant = r1_low;
+ uint32_t flag_rnd = (int_mant & (1 << 7)) && (
+ (int_mant & (1 << 8)) ||
+ (int_mant & 0x7f) ||
+ (carry != 0));
+ if (((int_mant & (1<<31)) == 0) && (int_exp == 255)) {
+ fp_exp = 255;
+ fp_frac = extract32(int_mant, 8, 23);
+ } else if ((int_mant & (1<<31)) && (int_exp >= 127)) {
+ fp_exp = 255;
+ fp_frac = 0;
+ } else if ((int_mant & (1<<31)) && (int_exp <= -128)) {
+ fp_exp = 0;
+ fp_frac = 0;
+ } else if (int_mant == 0) {
+ fp_exp = 0;
+ fp_frac = 0;
+ } else {
+ if (((int_mant & (1 << 31)) == 0)) {
+ temp_exp = 0;
+ } else {
+ temp_exp = int_exp + 128;
+ }
+ fp_exp_frac = (((temp_exp & 0xff) << 23) |
+ extract32(int_mant, 8, 23))
+ + flag_rnd;
+ fp_exp = extract32(fp_exp_frac, 23, 8);
+ fp_frac = extract32(fp_exp_frac, 0, 23);
+ }
+ ret = r2 & (1 << 31);
+ ret = ret + (fp_exp << 23);
+ ret = ret + (fp_frac & 0x7fffff);
+
+ return ret;
+}
+
uint64_t helper_unpack(target_ulong arg1)
{
int32_t fp_exp = extract32(arg1, 23, 8);
@@ -1228,6 +1314,80 @@ uint64_t helper_dvinit_h_131(CPUTriCoreState *env, uint32_t r1, uint32_t r2)
return ret;
}
+uint64_t helper_dvadj(uint64_t r1, uint32_t r2)
+{
+ int32_t x_sign = (r1 >> 63);
+ int32_t q_sign = x_sign ^ (r2 >> 31);
+ int32_t eq_pos = x_sign & ((r1 >> 32) == r2);
+ int32_t eq_neg = x_sign & ((r1 >> 32) == -r2);
+ uint32_t quotient;
+ uint64_t ret, remainder;
+
+ if ((q_sign & ~eq_neg) | eq_pos) {
+ quotient = (r1 + 1) & 0xffffffff;
+ } else {
+ quotient = r1 & 0xffffffff;
+ }
+
+ if (eq_pos | eq_neg) {
+ remainder = 0;
+ } else {
+ remainder = (r1 & 0xffffffff00000000ull);
+ }
+ ret = remainder|quotient;
+ return ret;
+}
+
+uint64_t helper_dvstep(uint64_t r1, uint32_t r2)
+{
+ int32_t dividend_sign = extract64(r1, 63, 1);
+ int32_t divisor_sign = extract32(r2, 31, 1);
+ int32_t quotient_sign = (dividend_sign != divisor_sign);
+ int32_t addend, dividend_quotient, remainder;
+ int32_t i, temp;
+
+ if (quotient_sign) {
+ addend = r2;
+ } else {
+ addend = -r2;
+ }
+ dividend_quotient = (int32_t)r1;
+ remainder = (int32_t)(r1 >> 32);
+
+ for (i = 0; i < 8; i++) {
+ remainder = (remainder << 1) | extract32(dividend_quotient, 31, 1);
+ dividend_quotient <<= 1;
+ temp = remainder + addend;
+ if ((temp < 0) == dividend_sign) {
+ remainder = temp;
+ }
+ if (((temp < 0) == dividend_sign)) {
+ dividend_quotient = dividend_quotient | !quotient_sign;
+ } else {
+ dividend_quotient = dividend_quotient | quotient_sign;
+ }
+ }
+ return ((uint64_t)remainder << 32) | (uint32_t)dividend_quotient;
+}
+
+uint64_t helper_dvstep_u(uint64_t r1, uint32_t r2)
+{
+ int32_t dividend_quotient = extract64(r1, 0, 32);
+ int64_t remainder = extract64(r1, 32, 32);
+ int32_t i;
+ int64_t temp;
+ for (i = 0; i < 8; i++) {
+ remainder = (remainder << 1) | extract32(dividend_quotient, 31, 1);
+ dividend_quotient <<= 1;
+ temp = (remainder & 0xffffffff) - r2;
+ if (temp >= 0) {
+ remainder = temp;
+ }
+ dividend_quotient = dividend_quotient | !(temp < 0);
+ }
+ return ((uint64_t)remainder << 32) | (uint32_t)dividend_quotient;
+}
+
uint64_t helper_mul_h(uint32_t arg00, uint32_t arg01,
uint32_t arg10, uint32_t arg11, uint32_t n)
{
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index aa70f61..34cec41 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -182,6 +182,18 @@ void tricore_cpu_dump_state(CPUState *cs, FILE *f,
tcg_temp_free(arg11); \
} while (0)
+#define GEN_HELPER_RRR(name, rl, rh, al1, ah1, arg2) do { \
+ TCGv_i64 ret = tcg_temp_new_i64(); \
+ TCGv_i64 arg1 = tcg_temp_new_i64(); \
+ \
+ tcg_gen_concat_i32_i64(arg1, al1, ah1); \
+ gen_helper_##name(ret, arg1, arg2); \
+ tcg_gen_extr_i64_i32(rl, rh, ret); \
+ \
+ tcg_temp_free_i64(ret); \
+ tcg_temp_free_i64(arg1); \
+} while (0)
+
#define EA_ABS_FORMAT(con) (((con & 0x3C000) << 14) + (con & 0x3FFF))
#define EA_B_ABSOLUT(con) (((offset & 0xf00000) << 8) | \
((offset & 0x0fffff) << 1))
@@ -820,6 +832,45 @@ static inline void gen_subc_CC(TCGv ret, TCGv r1, TCGv r2)
tcg_temp_free(temp);
}
+static inline void gen_cond_sub(TCGCond cond, TCGv r1, TCGv r2, TCGv r3,
+ TCGv r4)
+{
+ TCGv temp = tcg_temp_new();
+ TCGv temp2 = tcg_temp_new();
+ TCGv result = tcg_temp_new();
+ TCGv mask = tcg_temp_new();
+ TCGv t0 = tcg_const_i32(0);
+
+ /* create mask for sticky bits */
+ tcg_gen_setcond_tl(cond, mask, r4, t0);
+ tcg_gen_shli_tl(mask, mask, 31);
+
+ tcg_gen_sub_tl(result, r1, r2);
+ /* Calc PSW_V */
+ tcg_gen_xor_tl(temp, result, r1);
+ tcg_gen_xor_tl(temp2, r1, r2);
+ tcg_gen_and_tl(temp, temp, temp2);
+ tcg_gen_movcond_tl(cond, cpu_PSW_V, r4, t0, temp, cpu_PSW_V);
+ /* Set PSW_SV */
+ tcg_gen_and_tl(temp, temp, mask);
+ tcg_gen_or_tl(cpu_PSW_SV, temp, cpu_PSW_SV);
+ /* calc AV bit */
+ tcg_gen_add_tl(temp, result, result);
+ tcg_gen_xor_tl(temp, temp, result);
+ tcg_gen_movcond_tl(cond, cpu_PSW_AV, r4, t0, temp, cpu_PSW_AV);
+ /* calc SAV bit */
+ tcg_gen_and_tl(temp, temp, mask);
+ tcg_gen_or_tl(cpu_PSW_SAV, temp, cpu_PSW_SAV);
+ /* write back result */
+ tcg_gen_movcond_tl(cond, r3, r4, t0, result, r1);
+
+ tcg_temp_free(t0);
+ tcg_temp_free(temp);
+ tcg_temp_free(temp2);
+ tcg_temp_free(result);
+ tcg_temp_free(mask);
+}
+
static inline void gen_abs(TCGv ret, TCGv r1)
{
TCGv temp = tcg_temp_new();
@@ -5041,6 +5092,99 @@ static void decode_rrpw_extract_insert(CPUTriCoreState *env, DisasContext *ctx)
}
}
+/* RRR format */
+static void decode_rrr_cond_select(CPUTriCoreState *env, DisasContext *ctx)
+{
+ uint32_t op2;
+ int r1, r2, r3, r4;
+ TCGv temp;
+
+ op2 = MASK_OP_RRR_OP2(ctx->opcode);
+ r1 = MASK_OP_RRR_S1(ctx->opcode);
+ r2 = MASK_OP_RRR_S2(ctx->opcode);
+ r3 = MASK_OP_RRR_S3(ctx->opcode);
+ r4 = MASK_OP_RRR_D(ctx->opcode);
+
+ switch (op2) {
+ case OPC2_32_RRR_CADD:
+ gen_cond_add(TCG_COND_NE, cpu_gpr_d[r1], cpu_gpr_d[r2],
+ cpu_gpr_d[r4], cpu_gpr_d[r3]);
+ break;
+ case OPC2_32_RRR_CADDN:
+ gen_cond_add(TCG_COND_EQ, cpu_gpr_d[r1], cpu_gpr_d[r2], cpu_gpr_d[r4],
+ cpu_gpr_d[r3]);
+ break;
+ case OPC2_32_RRR_CSUB:
+ gen_cond_sub(TCG_COND_NE, cpu_gpr_d[r1], cpu_gpr_d[r2], cpu_gpr_d[r4],
+ cpu_gpr_d[r3]);
+ break;
+ case OPC2_32_RRR_CSUBN:
+ gen_cond_sub(TCG_COND_EQ, cpu_gpr_d[r1], cpu_gpr_d[r2], cpu_gpr_d[r4],
+ cpu_gpr_d[r3]);
+ break;
+ case OPC2_32_RRR_SEL:
+ temp = tcg_const_i32(0);
+ tcg_gen_movcond_tl(TCG_COND_NE, cpu_gpr_d[r4], cpu_gpr_d[r3], temp,
+ cpu_gpr_d[r1], cpu_gpr_d[r2]);
+ tcg_temp_free(temp);
+ break;
+ case OPC2_32_RRR_SELN:
+ temp = tcg_const_i32(0);
+ tcg_gen_movcond_tl(TCG_COND_EQ, cpu_gpr_d[r4], cpu_gpr_d[r3], temp,
+ cpu_gpr_d[r1], cpu_gpr_d[r2]);
+ tcg_temp_free(temp);
+ break;
+ }
+}
+
+static void decode_rrr_divide(CPUTriCoreState *env, DisasContext *ctx)
+{
+ uint32_t op2;
+
+ int r1, r2, r3, r4;
+
+ op2 = MASK_OP_RRR_OP2(ctx->opcode);
+ r1 = MASK_OP_RRR_S1(ctx->opcode);
+ r2 = MASK_OP_RRR_S2(ctx->opcode);
+ r3 = MASK_OP_RRR_S3(ctx->opcode);
+ r4 = MASK_OP_RRR_D(ctx->opcode);
+
+ switch (op2) {
+ case OPC2_32_RRR_DVADJ:
+ GEN_HELPER_RRR(dvadj, cpu_gpr_d[r4], cpu_gpr_d[r4+1], cpu_gpr_d[r3],
+ cpu_gpr_d[r3+1], cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RRR_DVSTEP:
+ GEN_HELPER_RRR(dvstep, cpu_gpr_d[r4], cpu_gpr_d[r4+1], cpu_gpr_d[r3],
+ cpu_gpr_d[r3+1], cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RRR_DVSTEP_U:
+ GEN_HELPER_RRR(dvstep_u, cpu_gpr_d[r4], cpu_gpr_d[r4+1], cpu_gpr_d[r3],
+ cpu_gpr_d[r3+1], cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RRR_IXMAX:
+ GEN_HELPER_RRR(ixmax, cpu_gpr_d[r4], cpu_gpr_d[r4+1], cpu_gpr_d[r3],
+ cpu_gpr_d[r3+1], cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RRR_IXMAX_U:
+ GEN_HELPER_RRR(ixmax_u, cpu_gpr_d[r4], cpu_gpr_d[r4+1], cpu_gpr_d[r3],
+ cpu_gpr_d[r3+1], cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RRR_IXMIN:
+ GEN_HELPER_RRR(ixmin, cpu_gpr_d[r4], cpu_gpr_d[r4+1], cpu_gpr_d[r3],
+ cpu_gpr_d[r3+1], cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RRR_IXMIN_U:
+ GEN_HELPER_RRR(ixmin_u, cpu_gpr_d[r4], cpu_gpr_d[r4+1], cpu_gpr_d[r3],
+ cpu_gpr_d[r3+1], cpu_gpr_d[r2]);
+ break;
+ case OPC2_32_RRR_PACK:
+ gen_helper_pack(cpu_gpr_d[r4], cpu_PSW_C, cpu_gpr_d[r3],
+ cpu_gpr_d[r3+1], cpu_gpr_d[r1]);
+ break;
+ }
+}
+
static void decode_32Bit_opc(CPUTriCoreState *env, DisasContext *ctx)
{
int op1;
@@ -5324,6 +5468,12 @@ static void decode_32Bit_opc(CPUTriCoreState *env, DisasContext *ctx)
tcg_temp_free(temp);
}
break;
+/* RRR Format */
+ case OPCM_32_RRR_COND_SELECT:
+ decode_rrr_cond_select(env, ctx);
+ break;
+ case OPCM_32_RRR_DIVIDE:
+ decode_rrr_divide(env, ctx);
}
}
diff --git a/target-tricore/tricore-opcodes.h b/target-tricore/tricore-opcodes.h
index 82bd161..baf537f 100644
--- a/target-tricore/tricore-opcodes.h
+++ b/target-tricore/tricore-opcodes.h
@@ -516,7 +516,7 @@ enum {
OPC1_32_RRPW_DEXTR = 0x77,
/* RRR Format */
OPCM_32_RRR_COND_SELECT = 0x2b,
- OPCM_32_RRR_FLOAT = 0x6b,
+ OPCM_32_RRR_DIVIDE = 0x6b,
/* RRR1 Format */
OPCM_32_RRR1_MADD = 0x83,
OPCM_32_RRR1_MADDQ_H = 0x43,
--
2.2.2
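To show how "8 bits of quotient at a time" adds up to a full division, here
is a plain C sketch of an unsigned restoring-division step and a driver that
applies it four times. It is an independent model, not a copy of
helper_dvstep_u: it keeps the running remainder in 64 bits, and it assumes
the register pair starts as {0, dividend} with a nonzero divisor. The names
dvstep_u_model and udiv_by_steps are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One step: e holds {remainder (high word), dividend/quotient (low word)}. */
static uint64_t dvstep_u_model(uint64_t e, uint32_t divisor)
{
    assert(divisor != 0);
    uint32_t quotient = (uint32_t)e;
    uint64_t remainder = e >> 32;

    for (int i = 0; i < 8; i++) {
        /* move the next dividend bit into the remainder */
        remainder = (remainder << 1) | (quotient >> 31);
        quotient <<= 1;
        if (remainder >= divisor) {   /* subtraction fits: quotient bit 1 */
            remainder -= divisor;
            quotient |= 1;
        }
    }
    return (remainder << 32) | quotient;
}

/* Four 8-bit steps give the full 32-bit quotient and remainder. */
static uint64_t udiv_by_steps(uint32_t dividend, uint32_t divisor)
{
    uint64_t e = dividend;            /* high word (remainder) starts at 0 */
    for (int i = 0; i < 4; i++) {
        e = dvstep_u_model(e, divisor);
    }
    return e;
}
```

udiv_by_steps(1000, 7) returns a pair with quotient 142 in the low word and
remainder 6 in the high word.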
* Re: [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and RRR opcode format
2015-01-26 16:29 [Qemu-devel] [PATCH v2 0/4] TriCore add instructions of RR1, RR2, RRPW and RRR opcode format Bastian Koppelmann
` (3 preceding siblings ...)
2015-01-26 16:30 ` [Qemu-devel] [PATCH v2 4/4] target-tricore: Add instructions of RRR " Bastian Koppelmann
@ 2015-01-26 18:11 ` Richard Henderson
4 siblings, 0 replies; 6+ messages in thread
From: Richard Henderson @ 2015-01-26 18:11 UTC
To: Bastian Koppelmann, qemu-devel
On 01/26/2015 08:29 AM, Bastian Koppelmann wrote:
> v1 -> v2:
> - Add 3 helper functions (gen_mul_q, gen_mul_q_16, gen_mulr_q) to
> remove repetition.
> - gen_mul_q now uses 64 arithmetic instead of emulating it.
> - MUL_Q now uses arithmetic shift, instead of normal shift + sign extend for arg
> extraction.
> - optimize OPC2_32_RRPW_EXTR by using only two shifts, instead of four.
> - OPC1_32_RRPW_DEXTR now has r1 == r2 as a special case.
>
> Bastian Koppelmann (4):
> target-tricore: target-tricore: Add instructions of RR1 opcode format,
> that have 0x93 as first opcode
> target-tricore: Add instructions of RR2 opcode format
> target-tricore: Add instructions of RRPW opcode format
> target-tricore: Add instructions of RRR opcode format
>
> target-tricore/helper.h | 8 +
> target-tricore/op_helper.c | 160 ++++++++++++++
> target-tricore/translate.c | 439 +++++++++++++++++++++++++++++++++++++++
> target-tricore/tricore-opcodes.h | 2 +-
> 4 files changed, 608 insertions(+), 1 deletion(-)
Reviewed-by: Richard Henderson <rth@twiddle.net>
r~