From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1NVUmq-0008AB-Pj for qemu-devel@nongnu.org; Thu, 14 Jan 2010 13:51:16 -0500 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1NVUmp-00087o-K0 for qemu-devel@nongnu.org; Thu, 14 Jan 2010 13:51:16 -0500 Received: from [199.232.76.173] (port=38072 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NVUmp-00087V-DB for qemu-devel@nongnu.org; Thu, 14 Jan 2010 13:51:15 -0500 Received: from hall.aurel32.net ([88.191.82.174]:39600) by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1NVUmo-0007MX-Lu for qemu-devel@nongnu.org; Thu, 14 Jan 2010 13:51:15 -0500 Date: Thu, 14 Jan 2010 19:51:11 +0100 From: Aurelien Jarno Subject: Re: [Qemu-devel] [PATCH 2/2] tcg-x86_64: Avoid unnecessary REX.B prefixes. Message-ID: <20100114185111.GC23053@hall.aurel32.net> References: <20100106010537.9C9D2CBB@are.twiddle.net> <20100114161003.GF16630@volta.aurel32.net> <4B4F5DEC.9090909@twiddle.net> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <4B4F5DEC.9090909@twiddle.net> Sender: Aurelien Jarno List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Richard Henderson Cc: laurent.desnogues@gmail.com, qemu-devel@nongnu.org On Thu, Jan 14, 2010 at 10:09:48AM -0800, Richard Henderson wrote: > On 01/14/2010 08:10 AM, Aurelien Jarno wrote: >> On Tue, Jan 05, 2010 at 04:31:11PM -0800, Richard Henderson wrote: >>> A while ago Laurent pointed out that the setcc opcode emitted by >>> the setcond patch had unnecessary REX prefixes. >>> >>> The existing P_REXB internal opcode flag unconditionally emits >>> the REX prefix. Technically it's not needed if the register in >>> question is %al, %bl, %cl, %dl. >>> >>> Eliding the prefix requires splitting the P_REXB flag into two, >>> in order to indicate whether the byte register in question is >>> in the REG or the R/M field. Within TCG, the byte register is >>> in the REG field only for stores. >>> >>> Signed-off-by: Richard Henderson >>> --- >>> tcg/x86_64/tcg-target.c | 46 ++++++++++++++++++++++++++++++---------------- >>> 1 files changed, 30 insertions(+), 16 deletions(-) >>> >>> diff --git a/tcg/x86_64/tcg-target.c b/tcg/x86_64/tcg-target.c >>> index f584c94..8b6b68c 100644 >>> --- a/tcg/x86_64/tcg-target.c >>> +++ b/tcg/x86_64/tcg-target.c >>> @@ -217,9 +217,10 @@ static inline int tcg_target_const_match(tcg_target_long val, >>> #define JCC_JLE 0xe >>> #define JCC_JG 0xf >>> >>> -#define P_EXT 0x100 /* 0x0f opcode prefix */ >>> -#define P_REXW 0x200 /* set rex.w = 1 */ >>> -#define P_REXB 0x400 /* force rex use for byte registers */ >>> +#define P_EXT 0x100 /* 0x0f opcode prefix */ >>> +#define P_REXW 0x200 /* set rex.w = 1 */ >>> +#define P_REXB_R 0x400 /* REG field as byte register */ >>> +#define P_REXB_RM 0x800 /* R/M field as byte register */ >>> >>> static const uint8_t tcg_cond_to_jcc[10] = { >>> [TCG_COND_EQ] = JCC_JE, >>> @@ -234,17 +235,30 @@ static const uint8_t tcg_cond_to_jcc[10] = { >>> [TCG_COND_GTU] = JCC_JA, >>> }; >>> >>> -static inline void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x) >>> +static void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x) >>> { >>> - int rex; >>> - rex = ((opc>> 6)& 0x8) | ((r>> 1)& 0x4) | >>> - ((x>> 2)& 2) | ((rm>> 3)& 1); >>> - if (rex || (opc& P_REXB)) { >>> + int rex = 0; >>> + >>> + rex |= (opc& P_REXW)>> 6; /* REX.W */ >>> + rex |= (r& 8)>> 1; /* REX.R */ >>> + rex |= (x& 8)>> 2; /* REX.X */ >>> + rex |= (rm& 8)>> 3; /* REX.B */ >>> + >>> + /* P_REXB_{R,RM} indicates that the given register is the low byte. >>> + For %[abcd]l we need no REX prefix, but for %{si,di,bp,sp}l we do, >>> + as otherwise the encoding indicates %[abcd]h. Note that the values >>> + that are ORed in merely indicate that the REX byte must be present; >>> + those bits get discarded in output. */ >>> + rex |= (r>= 4 ? opc& P_REXB_R : 0); >>> + rex |= (rm>= 4 ? opc& P_REXB_RM : 0); >>> + >>> + if (rex) { >>> tcg_out8(s, rex | 0x40); >>> } >> >> With the above change, rex can be > 0xff. Not sure it's not a good idea >> to not have an explicit cast when calling tcg_out8(), even if it >> technically works. > > Same as below. > >> >>> - if (opc& P_EXT) >>> + if (opc& P_EXT) { >>> tcg_out8(s, 0x0f); >>> - tcg_out8(s, opc& 0xff); >>> + } >>> + tcg_out8(s, opc); >> >> What's the reason for removing the '& 0xff' part? tcg_out8() takes an uint8_t. > > Yes, and the uint8_t truncates the value just fine. Is there any > particular reason you want to clutter the code with a duplicate > truncation? It might have been reasonable if the function name didn't > quite clearly indicate that a single byte was going to be output... > I have to say I don't really like this way of programming. It seems I agree with gcc people, as they have created the -Wconversion option (ok, it emits tons of warning within QEMU). Moreover here, the second truncation removal is totally unreleated to this patch. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurelien@aurel32.net http://www.aurel32.net