From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
    id 1NLbAg-0004RC-7v
    for qemu-devel@nongnu.org; Fri, 18 Dec 2009 06:38:58 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
    id 1NLbAb-0004NA-3d
    for qemu-devel@nongnu.org; Fri, 18 Dec 2009 06:38:57 -0500
Received: from [199.232.76.173] (port=46957 helo=monty-python.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43) id 1NLbAa-0004N5-8o
    for qemu-devel@nongnu.org; Fri, 18 Dec 2009 06:38:52 -0500
Received: from mail-px0-f189.google.com ([209.85.216.189]:42783)
    by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from )
    id 1NLbAW-0007NM-Su
    for qemu-devel@nongnu.org; Fri, 18 Dec 2009 06:38:50 -0500
Received: by pxi27 with SMTP id 27so363281pxi.4
    for ; Fri, 18 Dec 2009 03:38:45 -0800 (PST)
MIME-Version: 1.0
In-Reply-To:
References: <761ea48b0912170620l534dcb02m8ea6b59524d76dbe@mail.gmail.com>
Date: Fri, 18 Dec 2009 12:38:45 +0100
Message-ID: <761ea48b0912180338l340b5665t217ff2af1b9e87fb@mail.gmail.com>
From: Laurent Desnogues
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: [Qemu-devel] Re: [PATCH 1/6] tcg: Generic support for conditional set and conditional move.
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Richard Henderson
Cc: qemu-devel@nongnu.org

On Thu, Dec 17, 2009 at 6:27 PM, Richard Henderson wrote:
> Defines setcond and movcond for implementing conditional moves at
> the tcg opcode level.  64-bit-on-32-bit is expanded via a setcond2
> primitive plus other operations.
>
> Signed-off-by: Richard Henderson
> ---
>  tcg/README    |   26 +++++++++++++++-
>  tcg/tcg-op.h  |   91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tcg/tcg-opc.h |    5 +++
>  tcg/tcg.c     |   23 ++++++++++----
>  4 files changed, 138 insertions(+), 7 deletions(-)
>
> diff --git a/tcg/README b/tcg/README
> index e672258..8617994 100644
> --- a/tcg/README
> +++ b/tcg/README
> @@ -152,6 +152,11 @@ Conditional jump if t0 cond t1 is true. cond can be:
>      TCG_COND_LEU /* unsigned */
>      TCG_COND_GTU /* unsigned */
>
> +* brcond2_i32 cond, t0_low, t0_high, t1_low, t1_high, label
> +
> +Similar to brcond, except that the 64-bit values T0 and T1
> +are formed from two 32-bit arguments.
> +
>  ********* Arithmetic
>
>  * add_i32/i64 t0, t1, t2
> @@ -282,6 +287,25 @@ order bytes must be set to zero.
>  Indicate that the value of t0 won't be used later. It is useful to
>  force dead code elimination.
>
> +********* Conditional moves
> +
> +* setcond_i32/i64 cond, dest, t1, t2
> +
> +dest = (t1 cond t2)
> +
> +Set DEST to 1 if (T1 cond T2) is true, otherwise set to 0.
> +
> +* movcond_i32/i64 cond, dest, c1, c2, vtrue, vfalse
> +
> +dest= (c1 cond c2 ? vtrue : of)

As malc already wrote this should be:

dest = (c1 cond c2 ? vtrue : vfalse)

> +
> +Set DEST to VTRUE if (c1 cond c2) is true, otherwise set to VFALSE.
> +
> +* setcond2_i32 cond, dest, t1_low, t1_high, t2_low, t2_high
> +
> +Similar to setcond, except that the 64-bit values T1 and T2 are
> +formed from two 32-bit arguments.  The result is a 32-bit value.
> +
>  ********* Type conversions
>
>  * ext_i32_i64 t0, t1
> @@ -375,7 +399,7 @@ The target word size (TCG_TARGET_REG_BITS) is expected to be 32 bit or
>
>  On a 32 bit target, all 64 bit operations are converted to 32 bits. A
>  few specific operations must be implemented to allow it (see add2_i32,
> -sub2_i32, brcond2_i32).
> +sub2_i32, brcond2_i32, setcond2_i32).
>
>  Floating point operations are not supported in this version. A
>  previous incarnation of the code generator had full support of them,
> diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
> index faf2e8b..f43ed16 100644
> --- a/tcg/tcg-op.h
> +++ b/tcg/tcg-op.h
> @@ -280,6 +280,32 @@ static inline void tcg_gen_op6_i64(int opc, TCGv_i64 arg1, TCGv_i64 arg2,
>      *gen_opparam_ptr++ = GET_TCGV_I64(arg6);
>  }
>
> +static inline void tcg_gen_op6i_i32(int opc, TCGv_i32 arg1, TCGv_i32 arg2,
> +                                    TCGv_i32 arg3, TCGv_i32 arg4,
> +                                    TCGv_i32 arg5, TCGArg arg6)
> +{
> +    *gen_opc_ptr++ = opc;
> +    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
> +    *gen_opparam_ptr++ = GET_TCGV_I32(arg2);
> +    *gen_opparam_ptr++ = GET_TCGV_I32(arg3);
> +    *gen_opparam_ptr++ = GET_TCGV_I32(arg4);
> +    *gen_opparam_ptr++ = GET_TCGV_I32(arg5);
> +    *gen_opparam_ptr++ = arg6;
> +}
> +
> +static inline void tcg_gen_op6i_i64(int opc, TCGv_i64 arg1, TCGv_i64 arg2,
> +                                    TCGv_i64 arg3, TCGv_i64 arg4,
> +                                    TCGv_i64 arg5, TCGArg arg6)
> +{
> +    *gen_opc_ptr++ = opc;
> +    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
> +    *gen_opparam_ptr++ = GET_TCGV_I64(arg2);
> +    *gen_opparam_ptr++ = GET_TCGV_I64(arg3);
> +    *gen_opparam_ptr++ = GET_TCGV_I64(arg4);
> +    *gen_opparam_ptr++ = GET_TCGV_I64(arg5);
> +    *gen_opparam_ptr++ = arg6;
> +}
> +
>  static inline void tcg_gen_op6ii_i32(int opc, TCGv_i32 arg1, TCGv_i32 arg2,
>                                       TCGv_i32 arg3, TCGv_i32 arg4, TCGArg arg5,
>                                       TCGArg arg6)
> @@ -1795,6 +1821,67 @@ static inline void tcg_gen_rotri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
>      }
>  }
>
> +static inline void tcg_gen_setcond_i32(int cond, TCGv_i32 ret,
> +                                       TCGv_i32 arg1, TCGv_i32 arg2)
> +{
> +    tcg_gen_op4i_i32(INDEX_op_setcond_i32, ret, arg1, arg2, cond);
> +}
> +
> +static inline void tcg_gen_setcond_i64(int cond, TCGv_i64 ret,
> +                                       TCGv_i64 arg1, TCGv_i64 arg2)
> +{
> +#if TCG_TARGET_REG_BITS == 64
> +    tcg_gen_op4i_i64(INDEX_op_setcond_i64, ret, arg1, arg2, cond);
> +#else
> +    tcg_gen_op6i_i32(INDEX_op_setcond2_i32, TCGV_LOW(ret),
> +                     TCGV_LOW(arg1), TCGV_HIGH(arg1),
> +                     TCGV_LOW(arg2), TCGV_HIGH(arg2), cond);
> +    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
> +#endif
> +}
> +
> +static inline void tcg_gen_movcond_i32(int cond, TCGv_i32 ret,
> +                                       TCGv_i32 cmp1, TCGv_i32 cmp2,
> +                                       TCGv_i32 op_t, TCGv_i32 op_f)
> +{
> +    if (TCGV_EQUAL_I32(op_t, op_f)) {
> +        tcg_gen_mov_i32(ret, op_t);
> +        return;
> +    }
> +    tcg_gen_op6i_i32(INDEX_op_movcond_i32, ret, cmp1, cmp2, op_t, op_f, cond);
> +}
> +
> +static inline void tcg_gen_movcond_i64(int cond, TCGv_i64 ret,
> +                                       TCGv_i64 cmp1, TCGv_i64 cmp2,
> +                                       TCGv_i64 op_t, TCGv_i64 op_f)
> +{
> +    if (TCGV_EQUAL_I64(op_t, op_f)) {
> +        tcg_gen_mov_i64(ret, op_t);
> +        return;
> +    }
> +#if TCG_TARGET_REG_BITS == 64
> +    tcg_gen_op6i_i64(INDEX_op_movcond_i64, ret, cmp1, cmp2, op_t, op_f, cond);
> +#else
> +    {
> +        TCGv_i32 t0 = tcg_temp_new_i32();
> +        TCGv_i32 zero = tcg_const_i32(0);
> +
> +        tcg_gen_op6i_i32(INDEX_op_setcond2_i32, t0,
> +                         TCGV_LOW(cmp1), TCGV_HIGH(cmp1),
> +                         TCGV_LOW(cmp2), TCGV_HIGH(cmp2), cond);
> +
> +        /* ??? We could perhaps conditionally define a movcond2_i32.  */
> +        tcg_gen_movcond_i32(TCG_COND_NE, TCGV_LOW(ret), t0, zero,
> +                            TCGV_LOW(op_t), TCGV_LOW(op_f));
> +        tcg_gen_movcond_i32(TCG_COND_NE, TCGV_HIGH(ret), t0, zero,
> +                            TCGV_HIGH(op_t), TCGV_HIGH(op_f));
> +
> +        tcg_temp_free_i32(t0);
> +        tcg_temp_free_i32(zero);
> +    }
> +#endif

I agree movcond2 would be handy (though it can be argued that the
speed of a 64-bit guest on a 32-bit host, where it would matter the
most, is low anyway). I think it would also be nice to have a movtrue
helper that'd simply be movcond cond, dest, c1, c2, vtrue, dest.
All that can wait.

> +}
> +
>  /***************************************/
>  /* QEMU specific operations. Their type depend on the QEMU CPU
>     type. */
> @@ -2067,6 +2154,8 @@ static inline void tcg_gen_qemu_st64(TCGv_i64 arg, TCGv addr, int mem_index)
>  #define tcg_gen_sari_tl tcg_gen_sari_i64
>  #define tcg_gen_brcond_tl tcg_gen_brcond_i64
>  #define tcg_gen_brcondi_tl tcg_gen_brcondi_i64
> +#define tcg_gen_setcond_tl tcg_gen_setcond_i64
> +#define tcg_gen_movcond_tl tcg_gen_movcond_i64
>  #define tcg_gen_mul_tl tcg_gen_mul_i64
>  #define tcg_gen_muli_tl tcg_gen_muli_i64
>  #define tcg_gen_div_tl tcg_gen_div_i64
> @@ -2137,6 +2226,8 @@ static inline void tcg_gen_qemu_st64(TCGv_i64 arg, TCGv addr, int mem_index)
>  #define tcg_gen_sari_tl tcg_gen_sari_i32
>  #define tcg_gen_brcond_tl tcg_gen_brcond_i32
>  #define tcg_gen_brcondi_tl tcg_gen_brcondi_i32
> +#define tcg_gen_setcond_tl tcg_gen_setcond_i32
> +#define tcg_gen_movcond_tl tcg_gen_movcond_i32
>  #define tcg_gen_mul_tl tcg_gen_mul_i32
>  #define tcg_gen_muli_tl tcg_gen_muli_i32
>  #define tcg_gen_div_tl tcg_gen_div_i32
> diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
> index b7f3fd7..086968c 100644
> --- a/tcg/tcg-opc.h
> +++ b/tcg/tcg-opc.h
> @@ -42,6 +42,8 @@ DEF2(br, 0, 0, 1, TCG_OPF_BB_END | TCG_OPF_SIDE_EFFECTS)
>
>  DEF2(mov_i32, 1, 1, 0, 0)
>  DEF2(movi_i32, 1, 0, 1, 0)
> +DEF2(setcond_i32, 1, 2, 1, 0)
> +DEF2(movcond_i32, 1, 4, 1, 0)
>  /* load/store */
>  DEF2(ld8u_i32, 1, 1, 1, 0)
>  DEF2(ld8s_i32, 1, 1, 1, 0)
> @@ -82,6 +84,7 @@ DEF2(add2_i32, 2, 4, 0, 0)
>  DEF2(sub2_i32, 2, 4, 0, 0)
>  DEF2(brcond2_i32, 0, 4, 2, TCG_OPF_BB_END | TCG_OPF_SIDE_EFFECTS)
>  DEF2(mulu2_i32, 2, 2, 0, 0)
> +DEF2(setcond2_i32, 1, 4, 1, 0)
>  #endif
>  #ifdef TCG_TARGET_HAS_ext8s_i32
>  DEF2(ext8s_i32, 1, 1, 0, 0)
> @@ -111,6 +114,8 @@ DEF2(neg_i32, 1, 1, 0, 0)
>  #if TCG_TARGET_REG_BITS == 64
>  DEF2(mov_i64, 1, 1, 0, 0)
>  DEF2(movi_i64, 1, 0, 1, 0)
> +DEF2(setcond_i64, 1, 2, 1, 0)
> +DEF2(movcond_i64, 1, 4, 1, 0)
>  /* load/store */
>  DEF2(ld8u_i64, 1, 1, 1, 0)
>  DEF2(ld8s_i64, 1, 1, 1, 0)
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index 3c0e296..f7ea727 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -670,6 +670,7 @@ void tcg_gen_shifti_i64(TCGv_i64 ret, TCGv_i64 arg1,
>  }
>  #endif
>
> +

Was this really needed? :-)


Laurent

>  static void tcg_reg_alloc_start(TCGContext *s)
>  {
>      int i;
> @@ -888,21 +889,31 @@ void tcg_dump_ops(TCGContext *s, FILE *outfile)
>                  fprintf(outfile, "%s",
>                          tcg_get_arg_str_idx(s, buf, sizeof(buf), args[k++]));
>              }
> -            if (c == INDEX_op_brcond_i32
> +            switch (c) {
> +            case INDEX_op_brcond_i32:
> +#if TCG_TARGET_REG_BITS == 32
> +            case INDEX_op_brcond2_i32:
> +#elif TCG_TARGET_REG_BITS == 64
> +            case INDEX_op_brcond_i64:
> +#endif
> +            case INDEX_op_setcond_i32:
> +            case INDEX_op_movcond_i32:
>  #if TCG_TARGET_REG_BITS == 32
> -                || c == INDEX_op_brcond2_i32
> +            case INDEX_op_setcond2_i32:
>  #elif TCG_TARGET_REG_BITS == 64
> -                || c == INDEX_op_brcond_i64
> +            case INDEX_op_setcond_i64:
> +            case INDEX_op_movcond_i64:
>  #endif
> -                ) {
>                  if (args[k] < ARRAY_SIZE(cond_name) && cond_name[args[k]])
>                      fprintf(outfile, ",%s", cond_name[args[k++]]);
>                  else
>                      fprintf(outfile, ",$0x%" TCG_PRIlx, args[k++]);
>                  i = 1;
> -            }
> -            else
> +                break;
> +            default:
>                  i = 0;
> +                break;
> +            }
>              for(; i < nb_cargs; i++) {
>                  if (k != 0)
>                      fprintf(outfile, ",");
> --
> 1.6.5.2
>
>