Date: Thu, 14 Jan 2010 16:57:27 +0100
From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH 1/2] tcg-x86_64: Special-case all 32-bit AND operands.
Message-ID: <20100114155727.GE16630@volta.aurel32.net>
References: <20100106010536.9092BCBA@are.twiddle.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <20100106010536.9092BCBA@are.twiddle.net>
List-Id: qemu-devel.nongnu.org
To: Richard Henderson
Cc: laurent.desnogues@gmail.com, qemu-devel@nongnu.org

On Tue, Jan 05, 2010 at 04:03:00PM -0800, Richard Henderson wrote:
> This avoids an unnecessary REX.W prefix when dealing with AND
> operands that fit into a 32-bit quantity. The most common change
> actually seen is movz[wb]q -> movz[wb]l.
>
> Similarly, avoid REXW in ext{8,16}u_i64 tcg opcodes.
>
> Signed-off-by: Richard Henderson
> ---
>  tcg/x86_64/tcg-target.c | 26 ++++++++------------------
>  1 files changed, 8 insertions(+), 18 deletions(-)
>
> diff --git a/tcg/x86_64/tcg-target.c b/tcg/x86_64/tcg-target.c
> index 2339091..f584c94 100644
> --- a/tcg/x86_64/tcg-target.c
> +++ b/tcg/x86_64/tcg-target.c
> @@ -426,24 +426,18 @@ static inline void tgen_arithi64(TCGContext *s, int c, int r0, int64_t val)
>      } else if ((c == ARITH_ADD && val == -1) || (c == ARITH_SUB && val == 1)) {
>          /* dec */
>          tcg_out_modrm(s, 0xff | P_REXW, 1, r0);
> -    } else if (val == (int8_t)val) {
> -        tcg_out_modrm(s, 0x83 | P_REXW, c, r0);
> -        tcg_out8(s, val);
> -    } else if (c == ARITH_AND && val == 0xffu) {
> -        /* movzbl */
> -        tcg_out_modrm(s, 0xb6 | P_EXT | P_REXW, r0, r0);
> -    } else if (c == ARITH_AND && val == 0xffffu) {
> -        /* movzwl */
> -        tcg_out_modrm(s, 0xb7 | P_EXT | P_REXW, r0, r0);
>      } else if (c == ARITH_AND && val == 0xffffffffu) {
>          /* 32-bit mov zero extends */
>          tcg_out_modrm(s, 0x8b, r0, r0);
> +    } else if (c == ARITH_AND && (uint64_t)val <= 0xffffffffu) {
> +        /* AND with no high bits set can use a 32-bit operation. */
> +        tgen_arithi32(s, c, r0, val);

Do we really want to call tgen_arithi32() here? That will redo part of
the above tests again. It might be better to simply remove the REX.W
prefix above instead.
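
For reference, here is why dropping REX.W is safe for these operands. On x86_64 a 32-bit operation zero-extends its result into the upper half of the destination register, so an AND whose immediate has no bits set above bit 31 gives the same 64-bit result either way. The sketch below models that in plain C. The helper names (and64, and32_zext, and_forms_agree) are made up for illustration and are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of these exist in QEMU. */

/* Models a 64-bit AND, i.e. the instruction emitted with a REX.W prefix. */
static uint64_t and64(uint64_t reg, uint64_t imm)
{
    return reg & imm;
}

/*
 * Models a 32-bit AND without REX.W: only the low halves take part, and
 * the upper 32 bits of the destination register end up as zero.
 */
static uint64_t and32_zext(uint64_t reg, uint64_t imm)
{
    uint32_t lo = (uint32_t)reg & (uint32_t)imm;
    return (uint64_t)lo;
}

/* Returns 1 when both forms produce the same 64-bit value, 0 otherwise. */
static int and_forms_agree(uint64_t reg, uint64_t imm)
{
    return and64(reg, imm) == and32_zext(reg, imm);
}
```

The two forms agree whenever (uint64_t)val <= 0xffffffffu, which is the condition the new branch tests, and they stop agreeing as soon as the immediate has a bit set above bit 31.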
> +    } else if (val == (int8_t)val) {
> +        tcg_out_modrm(s, 0x83 | P_REXW, c, r0);
> +        tcg_out8(s, val);
>      } else if (val == (int32_t)val) {
>          tcg_out_modrm(s, 0x81 | P_REXW, c, r0);
>          tcg_out32(s, val);
> -    } else if (c == ARITH_AND && val == (uint32_t)val) {
> -        tcg_out_modrm(s, 0x81, c, r0);
> -        tcg_out32(s, val);
>      } else {
>          tcg_abort();
>      }
> @@ -1182,16 +1176,12 @@ static inline void tcg_out_op(TCGContext *s, int opc, const TCGArg *args,
>          tcg_out_modrm(s, 0x63 | P_REXW, args[0], args[1]);
>          break;
>      case INDEX_op_ext8u_i32:
> +    case INDEX_op_ext8u_i64:
>          tcg_out_modrm(s, 0xb6 | P_EXT | P_REXB, args[0], args[1]);
>          break;
>      case INDEX_op_ext16u_i32:
> -        tcg_out_modrm(s, 0xb7 | P_EXT, args[0], args[1]);
> -        break;
> -    case INDEX_op_ext8u_i64:
> -        tcg_out_modrm(s, 0xb6 | P_EXT | P_REXW, args[0], args[1]);
> -        break;
>      case INDEX_op_ext16u_i64:
> -        tcg_out_modrm(s, 0xb7 | P_EXT | P_REXW, args[0], args[1]);
> +        tcg_out_modrm(s, 0xb7 | P_EXT, args[0], args[1]);
>          break;
>      case INDEX_op_ext32u_i64:
>          tcg_out_modrm(s, 0x8b, args[0], args[1]);

This part looks fine.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net