From: Laurent Desnogues
To: malc
Cc: qemu-devel@nongnu.org, Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 0/7] tcg: conditional set and move opcodes
Date: Thu, 17 Dec 2009 16:37:27 +0100
Message-ID: <761ea48b0912170737i54f25561q468a4a193114a903@mail.gmail.com>

On Thu, Dec 17, 2009 at 4:32 PM, malc wrote:
[...]
>
> Some:
>  a. It breaks tcg on PPC[1]:
>
>    ...qemu/tcg/tcg.c:1378: tcg fatal error

What a surprise :-)

I can provide a similar patch for ARM (I already have one for
my own implementation of setcond), but I'll wait for this patch
series to stabilize first.


Laurent

>  b. Documentation for movcond has a typo, t0 is assigned not t1
>
>  c. Historically things like that were made conditional with
>    a generic fallback (bswap, neg, not, rot, etc)
>
>  d. Documentation for setcond2 is missing
>
>  e. There's some indentation weirdness here and there and `git am'
>    complains about added trailing whitespace
>
> It would also be interesting to learn what impact adding those two
> has on performance, any results?
>
> [..snip..]
>
> [1] With following i can run some i386 user tests on PPC32 (ls,
>    openssl)
>
> diff --git a/tcg/ppc/tcg-target.c b/tcg/ppc/tcg-target.c
> index 07e6941..195af13 100644
> --- a/tcg/ppc/tcg-target.c
> +++ b/tcg/ppc/tcg-target.c
> @@ -316,6 +316,7 @@ static int tcg_target_const_match(tcg_target_long val,
>  #define STH    OPCD(44)
>  #define STW    OPCD(36)
>
> +#define ADDIC  OPCD(12)
>  #define ADDI   OPCD(14)
>  #define ADDIS  OPCD(15)
>  #define ORI    OPCD(24)
> @@ -339,6 +340,7 @@ static int tcg_target_const_match(tcg_target_long val,
>  #define CRANDC XO19(129)
>  #define CRNAND XO19(225)
>  #define CROR   XO19(449)
> +#define CRNOR  XO19( 33)
>
>  #define EXTSB  XO31(954)
>  #define EXTSH  XO31(922)
> @@ -365,6 +367,8 @@ static int tcg_target_const_match(tcg_target_long val,
>  #define MTSPR  XO31(467)
>  #define SRAWI  XO31(824)
>  #define NEG    XO31(104)
> +#define MFCR   XO31( 19)
> +#define CNTLZW XO31( 26)
>
>  #define LBZX   XO31( 87)
>  #define LHZX   XO31(279)
> @@ -1073,6 +1077,95 @@ static void tcg_out_brcond (TCGContext *s, int cond,
>      tcg_out_bc (s, tcg_to_bc[cond], label_index);
>  }
>
> +static void tcg_out_setcond (TCGContext *s, int cond, TCGArg arg0,
> +                             TCGArg arg1, TCGArg arg2, int const_arg2)
> +{
> +    int crop, sh;
> +
> +    switch (cond) {
> +    case TCG_COND_EQ:
> +        if (const_arg2) {
> +            if ((uint16_t) arg2 == arg2) {
> +                tcg_out32 (s, XORI | RS (arg1) | RA (0) | arg2);
> +            }
> +            else {
> +                tcg_out_movi (s, TCG_TYPE_I32, 0, arg2);
> +                tcg_out32 (s, XOR | SAB (arg1, 0, 0));
> +            }
> +        }
> +        else {
> +            tcg_out32 (s, XOR | SAB (arg1, 0, arg2));
> +        }
> +        tcg_out32 (s, CNTLZW | RS (0) | RA (0));
> +        tcg_out32 (s, (RLWINM
> +                       | RA (arg0)
> +                       | RS (0)
> +                       | SH (27)
> +                       | MB (5)
> +                       | ME (31)
> +                       )
> +            );
> +        return;
> +
> +    case TCG_COND_NE:
> +        if (const_arg2) {
> +            if ((uint16_t) arg2 == arg2) {
> +                tcg_out32 (s, XORI | RS (arg1) | RA (0) | arg2);
> +            }
> +            else {
> +                tcg_out_movi (s, TCG_TYPE_I32, 0, arg2);
> +                tcg_out32 (s, XOR | SAB (arg1, 0, 0));
> +            }
> +        }
> +        else {
> +            tcg_out32 (s, XOR | SAB (arg1, 0, arg2));
> +        }
> +
> +        tcg_out32 (s, ADDIC | RT (arg0) | RA (0) | 0xffff);
> +        tcg_out32 (s, SUBFE | TAB (arg0, arg0, 0));
> +        return;
> +
> +    case TCG_COND_LTU:
> +    case TCG_COND_LT:
> +        sh = 29;
> +        crop = 0;
> +        break;
> +
> +    case TCG_COND_GEU:
> +    case TCG_COND_GE:
> +        sh = 31;
> +        crop = CRNOR | BT (7, CR_EQ) | BA (7, CR_LT) | BB (7, CR_LT);
> +        break;
> +
> +    case TCG_COND_LEU:
> +    case TCG_COND_LE:
> +        sh = 31;
> +        crop = CRNOR | BT (7, CR_EQ) | BA (7, CR_GT) | BB (7, CR_GT);
> +        break;
> +
> +    case TCG_COND_GTU:
> +    case TCG_COND_GT:
> +        sh = 30;
> +        crop = 0;
> +        break;
> +
> +    default:
> +        tcg_abort ();
> +    }
> +
> +    tcg_out_cmp (s, cond, arg1, arg2, const_arg2, 7);
> +    tcg_out32 (s, MFCR | RT (0));
> +    if (crop) tcg_out32 (s, crop);
> +    tcg_out32 (s, (RLWINM
> +                   | RA (arg0)
> +                   | RS (0)
> +                   | SH (sh)
> +                   | MB (31)
> +                   | ME (31)
> +                   )
> +            );
> +}
> +
>  /* XXX: we implement it at the target level to avoid having to
>     handle cross basic blocks temporaries */
>  static void tcg_out_brcond2 (TCGContext *s, const TCGArg *args,
> @@ -1496,6 +1589,10 @@ static void tcg_out_op(TCGContext *s, int opc, const TCGArg *args,
>          tcg_out32 (s, EXTSH | RS (args[1]) | RA (args[0]));
>          break;
>
> +    case INDEX_op_setcond_i32:
> +        tcg_out_setcond(s, args[3], args[0], args[1], args[2], const_args[2]);
> +        break;
> +
>      default:
>          tcg_dump_ops (s, stderr);
>          tcg_abort ();
> @@ -1544,6 +1641,8 @@ static const TCGTargetOpDef ppc_op_defs[] = {
>
>      { INDEX_op_neg_i32, { "r", "r" } },
>
> +    { INDEX_op_setcond_i32, { "r", "r", "ri" } },
> +
>  #if TARGET_LONG_BITS == 32
>      { INDEX_op_qemu_ld8u, { "r", "L" } },
>      { INDEX_op_qemu_ld8s, { "r", "L" } },
>
> --
> mailto:av1474@comtv.ru
>
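
A note on the EQ and NE cases in the quoted patch, since they are the
least obvious part: both avoid the condition register and compute the
0/1 result arithmetically from arg1 ^ arg2. Below is a rough C model of
what those two instruction sequences appear to compute, assuming 32-bit
operands. The names cntlzw_model, setcond_eq, setcond_ne and
setcond_model_check are made up for this illustration and are not part
of QEMU; the instruction semantics in the comments are my reading of the
PowerPC instructions named in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of cntlzw: number of leading zero bits in a 32-bit word,
   with the convention that the result is 32 when the input is 0. */
uint32_t cntlzw_model(uint32_t x)
{
    uint32_t n = 0;

    if (x == 0) {
        return 32;
    }
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* EQ case: xor, cntlzw, then rlwinm with SH=27, MB=5, ME=31.
   That rlwinm is a logical shift right by 5, so only a count
   of 32 (xor result was zero) yields 1. */
uint32_t setcond_eq(uint32_t a, uint32_t b)
{
    uint32_t t = a ^ b;

    return cntlzw_model(t) >> 5;
}

/* NE case: xor, then addic d,t,-1 followed by subfe d,d,t.
   addic sets the carry bit CA to 1 exactly when t != 0;
   subfe computes ~d + t + CA, which reduces to CA. */
uint32_t setcond_ne(uint32_t a, uint32_t b)
{
    uint32_t t = a ^ b;
    uint64_t sum = (uint64_t) t + 0xffffffffu;   /* t + (-1) with carry out */
    uint32_t d = (uint32_t) sum;
    uint32_t ca = (uint32_t) (sum >> 32);

    return (uint32_t) (~d + t + ca);
}

/* Compare both models against the plain C operators on a few samples. */
void setcond_model_check(void)
{
    static const uint32_t v[] = {
        0, 1, 2, 0x7fffffffu, 0x80000000u, 0xffffffffu, 0x12345678u
    };
    size_t i, j;

    for (i = 0; i < sizeof v / sizeof v[0]; i++) {
        for (j = 0; j < sizeof v / sizeof v[0]; j++) {
            assert(setcond_eq(v[i], v[j]) == (uint32_t) (v[i] == v[j]));
            assert(setcond_ne(v[i], v[j]) == (uint32_t) (v[i] != v[j]));
        }
    }
}
```

Calling setcond_model_check() from a test program exercises both
sequences; the remaining conditions in the patch go through tcg_out_cmp
and the condition register instead and are not modelled here.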