From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1NVUmq-0008AB-Pj
	for qemu-devel@nongnu.org; Thu, 14 Jan 2010 13:51:16 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1NVUmp-00087o-K0
	for qemu-devel@nongnu.org; Thu, 14 Jan 2010 13:51:16 -0500
Received: from [199.232.76.173] (port=38072 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1NVUmp-00087V-DB
	for qemu-devel@nongnu.org; Thu, 14 Jan 2010 13:51:15 -0500
Received: from hall.aurel32.net ([88.191.82.174]:39600)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from <aurelien@aurel32.net>)
	id 1NVUmo-0007MX-Lu
	for qemu-devel@nongnu.org; Thu, 14 Jan 2010 13:51:15 -0500
Date: Thu, 14 Jan 2010 19:51:11 +0100
From: Aurelien Jarno <aurelien@aurel32.net>
Subject: Re: [Qemu-devel] [PATCH 2/2] tcg-x86_64: Avoid unnecessary REX.B
	prefixes.
Message-ID: <20100114185111.GC23053@hall.aurel32.net>
References: <20100106010537.9C9D2CBB@are.twiddle.net>
	<20100114161003.GF16630@volta.aurel32.net>
	<4B4F5DEC.9090909@twiddle.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <4B4F5DEC.9090909@twiddle.net>
Sender: Aurelien Jarno <aurelien@aurel32.net>
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Richard Henderson <rth@twiddle.net>
Cc: laurent.desnogues@gmail.com, qemu-devel@nongnu.org

On Thu, Jan 14, 2010 at 10:09:48AM -0800, Richard Henderson wrote:
> On 01/14/2010 08:10 AM, Aurelien Jarno wrote:
>> On Tue, Jan 05, 2010 at 04:31:11PM -0800, Richard Henderson wrote:
>>> A while ago Laurent pointed out that the setcc opcode emitted by
>>> the setcond patch had unnecessary REX prefixes.
>>>
>>> The existing P_REXB internal opcode flag unconditionally emits
>>> the REX prefix.  Technically it's not needed if the register in
>>> question is %al, %bl, %cl, %dl.
>>>
>>> Eliding the prefix requires splitting the P_REXB flag into two,
>>> in order to indicate whether the byte register in question is
>>> in the REG or the R/M field.  Within TCG, the byte register is
>>> in the REG field only for stores.
>>>
>>> Signed-off-by: Richard Henderson<rth@twiddle.net>
>>> ---
>>>   tcg/x86_64/tcg-target.c |   46 ++++++++++++++++++++++++++++++----------------
>>>   1 files changed, 30 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/tcg/x86_64/tcg-target.c b/tcg/x86_64/tcg-target.c
>>> index f584c94..8b6b68c 100644
>>> --- a/tcg/x86_64/tcg-target.c
>>> +++ b/tcg/x86_64/tcg-target.c
>>> @@ -217,9 +217,10 @@ static inline int tcg_target_const_match(tcg_target_long val,
>>>   #define JCC_JLE 0xe
>>>   #define JCC_JG  0xf
>>>
>>> -#define P_EXT   0x100 /* 0x0f opcode prefix */
>>> -#define P_REXW  0x200 /* set rex.w = 1 */
>>> -#define P_REXB  0x400 /* force rex use for byte registers */
>>> +#define P_EXT		0x100		/* 0x0f opcode prefix */
>>> +#define P_REXW		0x200		/* set rex.w = 1 */
>>> +#define P_REXB_R	0x400		/* REG field as byte register */
>>> +#define P_REXB_RM	0x800		/* R/M field as byte register */
>>>
>>>   static const uint8_t tcg_cond_to_jcc[10] = {
>>>       [TCG_COND_EQ] = JCC_JE,
>>> @@ -234,17 +235,30 @@ static const uint8_t tcg_cond_to_jcc[10] = {
>>>       [TCG_COND_GTU] = JCC_JA,
>>>   };
>>>
>>> -static inline void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x)
>>> +static void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x)
>>>   {
>>> -    int rex;
>>> -    rex = ((opc>>  6)&  0x8) | ((r>>  1)&  0x4) |
>>> -        ((x>>  2)&  2) | ((rm>>  3)&  1);
>>> -    if (rex || (opc&  P_REXB)) {
>>> +    int rex = 0;
>>> +
>>> +    rex |= (opc&  P_REXW)>>  6;		/* REX.W */
>>> +    rex |= (r&  8)>>  1;		/* REX.R */
>>> +    rex |= (x&  8)>>  2;		/* REX.X */
>>> +    rex |= (rm&  8)>>  3;		/* REX.B */
>>> +
>>> +    /* P_REXB_{R,RM} indicates that the given register is the low byte.
>>> +       For %[abcd]l we need no REX prefix, but for %{si,di,bp,sp}l we do,
>>> +       as otherwise the encoding indicates %[abcd]h.  Note that the values
>>> +       that are ORed in merely indicate that the REX byte must be present;
>>> +       those bits get discarded in output.  */
>>> +    rex |= (r>= 4 ? opc&  P_REXB_R : 0);
>>> +    rex |= (rm>= 4 ? opc&  P_REXB_RM : 0);
>>> +
>>> +    if (rex) {
>>>           tcg_out8(s, rex | 0x40);
>>>       }
>>
>> With the above change, rex can be > 0xff. Not sure it's not a good idea
>> to not have an explicit cast when calling tcg_out8(), even if it
>> technically works.
>
> Same as below.
>
>>
>>> -    if (opc&  P_EXT)
>>> +    if (opc&  P_EXT) {
>>>           tcg_out8(s, 0x0f);
>>> -    tcg_out8(s, opc&  0xff);
>>> +    }
>>> +    tcg_out8(s, opc);
>>
>> What's the reason for removing the '&  0xff' part? tcg_out8() takes an uint8_t.
>
> Yes, and the uint8_t truncates the value just fine.  Is there any  
> particular reason you want to clutter the code with a duplicate  
> truncation?  It might have been reasonable if the function name didn't  
> quite clearly indicate that a single byte was going to be output...
>

I have to say I don't really like this way of programming. It seems I
agree with gcc people, as they have created the -Wconversion option (ok,
it emits tons of warning within QEMU).

Moreover here, the second truncation removal is totally unreleated to 
this patch.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net