From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43458) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bltCU-00040K-QK for qemu-devel@nongnu.org; Mon, 19 Sep 2016 03:41:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bltCQ-0006tF-Pg for qemu-devel@nongnu.org; Mon, 19 Sep 2016 03:41:13 -0400 Date: Mon, 19 Sep 2016 16:33:42 +1000 From: David Gibson Message-ID: <20160919063342.GD20488@umbus> References: <1474023111-11992-1-git-send-email-nikunj@linux.vnet.ibm.com> <1474023111-11992-5-git-send-email-nikunj@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="ylS2wUBXLOxYXZFQ" Content-Disposition: inline In-Reply-To: <1474023111-11992-5-git-send-email-nikunj@linux.vnet.ibm.com> Subject: Re: [Qemu-devel] [PATCH v3 4/5] target-ppc: add lxvh8x and stxvh8x List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Nikunj A Dadhania Cc: qemu-ppc@nongnu.org, rth@twiddle.net, qemu-devel@nongnu.org, benh@kernel.crashing.org --ylS2wUBXLOxYXZFQ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Sep 16, 2016 at 04:21:50PM +0530, Nikunj A Dadhania wrote: > lxvh8x: Load VSX Vector Halfword*8 > stxvh8x: Store VSX Vector Halfword*8 >=20 > Signed-off-by: Nikunj A Dadhania > --- > target-ppc/helper.h | 1 + > target-ppc/mem_helper.c | 6 ++++ > target-ppc/translate/vsx-impl.inc.c | 59 +++++++++++++++++++++++++++++++= ++++++ > target-ppc/translate/vsx-ops.inc.c | 2 ++ > 4 files changed, 68 insertions(+) >=20 > diff --git a/target-ppc/helper.h b/target-ppc/helper.h > index 9f6705d..ae91c3b 100644 > --- a/target-ppc/helper.h > +++ b/target-ppc/helper.h > @@ -298,6 +298,7 @@ DEF_HELPER_3(lvebx, void, env, avr, tl) > DEF_HELPER_3(lvehx, void, env, avr, tl) > DEF_HELPER_3(lvewx, void, env, avr, tl) > DEF_HELPER_1(deposit32x2, i64, i64) > +DEF_HELPER_1(bswap16x4, i64, i64) > DEF_HELPER_3(stvebx, void, env, avr, tl) > DEF_HELPER_3(stvehx, void, env, avr, tl) > DEF_HELPER_3(stvewx, void, env, avr, tl) > diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c > index 86e493e..26d5617 100644 > --- a/target-ppc/mem_helper.c > +++ b/target-ppc/mem_helper.c > @@ -290,6 +290,12 @@ uint64_t helper_deposit32x2(uint64_t x) > return deposit64((x >> 32), 32, 32, (x)); > } > =20 > +uint64_t helper_bswap16x4(uint64_t x) > +{ > + uint64_t m =3D 0x00ff00ff00ff00ffull; > + return ((x & m) << 8) | ((x >> 8) & m); > +} > + > #undef HI_IDX > #undef LO_IDX > =20 > diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/v= sx-impl.inc.c > index a3868f6..9e7588d 100644 > --- a/target-ppc/translate/vsx-impl.inc.c > +++ b/target-ppc/translate/vsx-impl.inc.c > @@ -93,6 +93,36 @@ static void gen_lxvw4x(DisasContext *ctx) > tcg_temp_free(EA); > } > =20 > +static void gen_lxvh8x(DisasContext *ctx) > +{ > + TCGv EA; > + TCGv_i64 xth =3D cpu_vsrh(xT(ctx->opcode)); > + TCGv_i64 xtl =3D cpu_vsrl(xT(ctx->opcode)); > + > + if (unlikely(!ctx->vsx_enabled)) { > + gen_exception(ctx, POWERPC_EXCP_VSXU); > + return; > + } > + gen_set_access_type(ctx, ACCESS_INT); > + EA =3D tcg_temp_new(); > + gen_addr_reg_index(ctx, EA); > + > + if (ctx->le_mode) { > + tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ); > + gen_helper_bswap16x4(xth, xth); > + tcg_gen_addi_tl(EA, EA, 8); > + tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ); > + gen_helper_bswap16x4(xtl, xtl); > + } else { > + tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_LEQ); > + gen_helper_deposit32x2(xth, xth); > + tcg_gen_addi_tl(EA, EA, 8); > + tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_LEQ); > + gen_helper_deposit32x2(xtl, xtl); The BE path still looks wrong. In fact, it's almost as wrong as it could possibly be. The LE load means the bytes of each halfword will be in the wrong order. It also means lower-address halfwords will end up in less significant halfwords internally, which is also wrong. Finally the deposit32x2 will get pairs of halfwords in the right relative order, but the order of the halfwords within the pair will still be wrong. I really think you need to make yourself some testcases for these... I think what you actually want is a BE load in all cases (which gets the halfword order correct), then apply the 16x4 bswap if and only if you're in LE mode, to get the bytes within each halfword in the right order. > + } > + tcg_temp_free(EA); > +} > + > #define VSX_STORE_SCALAR(name, operation) \ > static void gen_##name(DisasContext *ctx) \ > { \ > @@ -152,6 +182,35 @@ static void gen_stxvw4x(DisasContext *ctx) > tcg_temp_free(EA); > } > =20 > +static void gen_stxvh8x(DisasContext *ctx) > +{ > + TCGv_i64 xsh =3D cpu_vsrh(xS(ctx->opcode)); > + TCGv_i64 xsl =3D cpu_vsrl(xS(ctx->opcode)); > + TCGv EA; > + > + if (unlikely(!ctx->vsx_enabled)) { > + gen_exception(ctx, POWERPC_EXCP_VSXU); > + return; > + } > + gen_set_access_type(ctx, ACCESS_INT); > + EA =3D tcg_temp_new(); > + gen_addr_reg_index(ctx, EA); > + if (ctx->le_mode) { > + gen_helper_bswap16x4(xsh, xsh); > + tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_BEQ); > + tcg_gen_addi_tl(EA, EA, 8); > + gen_helper_bswap16x4(xsl, xsl); > + tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_BEQ); > + } else { > + gen_helper_deposit32x2(xsh, xsh); > + tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_LEQ); > + tcg_gen_addi_tl(EA, EA, 8); > + gen_helper_deposit32x2(xsl, xsl); > + tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_LEQ); Same problems as the load side. > + } > + tcg_temp_free(EA); > +} > + > #define MV_VSRW(name, tcgop1, tcgop2, target, source) \ > static void gen_##name(DisasContext *ctx) \ > { \ > diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vs= x-ops.inc.c > index 414b73b..21f9064 100644 > --- a/target-ppc/translate/vsx-ops.inc.c > +++ b/target-ppc/translate/vsx-ops.inc.c > @@ -7,6 +7,7 @@ GEN_HANDLER_E(lxsspx, 0x1F, 0x0C, 0x10, 0, PPC_NONE, PPC2= _VSX207), > GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX), > GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX), > GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX), > +GEN_HANDLER_E(lxvh8x, 0x1F, 0x0C, 0x19, 0, PPC_NONE, PPC2_ISA300), > =20 > GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX), > GEN_HANDLER_E(stxsibx, 0x1F, 0xD, 0x1C, 0, PPC_NONE, PPC2_ISA300), > @@ -15,6 +16,7 @@ GEN_HANDLER_E(stxsiwx, 0x1F, 0xC, 0x04, 0, PPC_NONE, PP= C2_VSX207), > GEN_HANDLER_E(stxsspx, 0x1F, 0xC, 0x14, 0, PPC_NONE, PPC2_VSX207), > GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX), > GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX), > +GEN_HANDLER_E(stxvh8x, 0x1F, 0x0C, 0x1D, 0, PPC_NONE, PPC2_ISA300), > =20 > GEN_HANDLER_E(mfvsrwz, 0x1F, 0x13, 0x03, 0x0000F800, PPC_NONE, PPC2_VSX2= 07), > GEN_HANDLER_E(mtvsrwa, 0x1F, 0x13, 0x06, 0x0000F800, PPC_NONE, PPC2_VSX2= 07), --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --ylS2wUBXLOxYXZFQ Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAEBCAAGBQJX34bCAAoJEGw4ysog2bOSqawQANcYx+U/+w9Rkjbos0M5wFkY lTlTZ3nSYIICgkBLzI6fvY0K9WtYSV3dw/krt71Ezrxa5sS6P3YvVrzoNfEwKq/C UCqeXkQ6v9ComCp9sBhlJ/xpNhBC5k9UtyER4bkAZEmTsfmCqAmLDQzwyiv+99Y6 fuvsb222RsXonnlZ5P59M3ifbUgTHKYa2NTP4jTeK64H4NWq9+FCbd1usOVCv3fa 1HR1q8zt7Zgc72kkpHB5I8hePoOfIOCYr2Ix/VknA855NNld5QejOyuO7oND6Jz4 gnGwmIMusZdLQr/OOqJRCRyjSpmNGUFLfgXcdyHKrpdawS3lXM7b7MBFP6M1/HNv mYmBn/W8dwrRPe9noY9PDYyDx6ueLAfYCIkZe0gDAuJKYB8v0p2jl7IFGczZyOLN 882DqoKeLNMyLXW1G3g3TNyQeyWa7LLYIbBX4U8nDpB2I3yT+jn6OVrAzluQP8pP E220QcEMz3QTTDJBGDVMaHfVxdm4MhYx9Mjp0T49MTL0QZXNum25NWTJHSa93VwO blOTD20WYBXWbft0wcAjC5iX5vi9/Kp6tO1cqwAII/lWslaVRNu2SjW1+gO8MmUS U0E5DoqPFf7fPI0Y3EEhIEHxQcy8W+sC5T+Gov29Mp6P48l62x3wzfD6/hT3EZGJ m6iDyyoxKM2hCtRmp69p =KGQA -----END PGP SIGNATURE----- --ylS2wUBXLOxYXZFQ--