From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1Lhk2s-0000iH-9p for qemu-devel@nongnu.org; Thu, 12 Mar 2009 08:29:54 -0400 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1Lhk2q-0000gf-5I for qemu-devel@nongnu.org; Thu, 12 Mar 2009 08:29:53 -0400 Received: from [199.232.76.173] (port=52318 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Lhk2p-0000gR-TM for qemu-devel@nongnu.org; Thu, 12 Mar 2009 08:29:52 -0400 Received: from mx20.gnu.org ([199.232.41.8]:26967) by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1Lhk2p-0000PA-L6 for qemu-devel@nongnu.org; Thu, 12 Mar 2009 08:29:51 -0400 Received: from mail.codesourcery.com ([65.74.133.4]) by mx20.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1Lhk2n-0000mC-PU for qemu-devel@nongnu.org; Thu, 12 Mar 2009 08:29:50 -0400 From: Paul Brook Subject: Re: [Qemu-devel] [PATCH 0/9] tcg: reorganize bswap* functions Date: Thu, 12 Mar 2009 12:29:46 +0000 References: <20090311170836.GD32521@hall.aurel32.net> In-Reply-To: <20090311170836.GD32521@hall.aurel32.net> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200903121229.47365.paul@codesourcery.com> Reply-To: qemu-devel@nongnu.org List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: qemu-devel@nongnu.org Cc: Aurelien Jarno > This patch series tries to reorganize a bit the bswap* TCG functions. In principle this looks ok, however several implementation issues: > +/* Note: we assume the six high bytes are set to zero */ > +static inline void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg) > +{ > + tcg_gen_bswap16_i32(TCGV_LOW(ret), TCGV_LOW(arg)); > +} > + > +/* Note: we assume the four high bytes are set to zero */ > +static inline void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg) > +{ > + tcg_gen_bswap32_i32(TCGV_LOW(ret), TCGV_LOW(arg)); > +} I think we want to preserve the zero extension of the value, i.e. you want= =20 something along the lines of: if (!TCGV_EQUAL_I64(ret, arg)) tcg_gen_movi_i32(TCGV_HIGH(ret), 0); =2D I'm not sure whether it's more efficient to do movi(ret, 0) or mov(ret,= =20 arg). With a bit of luck they'll both be optimized away most of the time. > +/* Note: we assume the four high bytes are set to zero */ > +static inline void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg) > +{ >... > + tcg_gen_shli_i64(t0, arg, 24); Likewise this is missing a mask operation. ext32u_i64(t0, t0) is probably t= he=20 most efficient way). > +++ b/tcg/i386/tcg-target.c > + =A0 =A0case INDEX_op_bswap16_i32: > + =A0 =A0 =A0 =A0tcg_out8(s, 0x66); > + =A0 =A0 =A0 =A0tcg_out_modrm(s, 0xc1, SHIFT_ROL, args[0]); > + =A0 =A0 =A0 =A0tcg_out8(s, 8); > + =A0 =A0{ INDEX_op_bswap16_i32, { "r", "0" } }, This is wrong. The r/m field of the modrm byte can only address the first 4= =20 registers (AL-DL). Values 4-7 address te second bytes of these registers=20 (a.k.a. AH-DH). I suspect you need to use the "q" constraint. > +++ b/tcg/x86_64/tcg-target.c > + =A0 =A0case INDEX_op_bswap16_i32: > + =A0 =A0case INDEX_op_bswap16_i64: > + =A0 =A0 =A0 =A0tcg_out8(s, 0x66); > + =A0 =A0 =A0 =A0tcg_out_modrm(s, 0xc1, SHIFT_ROL, args[0]); > + =A0 =A0 =A0 =A0tcg_out8(s, 8); > + =A0 =A0 =A0 =A0break; You need to use tcg_out_opc here to get REX prefixes. You need P_REXB to av= oid=20 the legacy encoding issues mentioned above, and the high bit of the r/m fie= ld=20 also goes in the REX byte. Paul