From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:45280) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bkNYG-0005dB-VJ for qemu-devel@nongnu.org; Wed, 14 Sep 2016 23:41:30 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bkNYC-0004Or-4k for qemu-devel@nongnu.org; Wed, 14 Sep 2016 23:41:28 -0400 Date: Thu, 15 Sep 2016 11:41:42 +1000 From: David Gibson Message-ID: <20160915014142.GK15077@voom.fritz.box> References: <1473662506-27441-1-git-send-email-nikunj@linux.vnet.ibm.com> <1473662506-27441-16-git-send-email-nikunj@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="F67xnxMyVwCoXwbx" Content-Disposition: inline In-Reply-To: <1473662506-27441-16-git-send-email-nikunj@linux.vnet.ibm.com> Subject: Re: [Qemu-devel] [PATCH RESEND v2 15/17] target-ppc: add lxvb16x and lxvh8x List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Nikunj A Dadhania Cc: qemu-ppc@nongnu.org, rth@twiddle.net, qemu-devel@nongnu.org, benh@kernel.crashing.org --F67xnxMyVwCoXwbx Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Sep 12, 2016 at 12:11:44PM +0530, Nikunj A Dadhania wrote: > lxvb16x: Load VSX Vector Byte*16 > lxvh8x: Load VSX Vector Halfword*8 >=20 > Signed-off-by: Nikunj A Dadhania > --- > target-ppc/helper.h | 1 + > target-ppc/mem_helper.c | 6 ++++ > target-ppc/translate/vsx-impl.inc.c | 57 +++++++++++++++++++++++++++++++= ++++++ > target-ppc/translate/vsx-ops.inc.c | 2 ++ > 4 files changed, 66 insertions(+) >=20 > diff --git a/target-ppc/helper.h b/target-ppc/helper.h > index 1bbeac4..6de0db7 100644 > --- a/target-ppc/helper.h > +++ b/target-ppc/helper.h > @@ -298,6 +298,7 @@ DEF_HELPER_3(lvebx, void, env, avr, tl) > DEF_HELPER_3(lvehx, void, env, avr, tl) > DEF_HELPER_3(lvewx, void, env, avr, tl) > DEF_HELPER_1(bswap32x2, i64, i64) > +DEF_HELPER_1(bswap16x4, i64, i64) > DEF_HELPER_3(stvebx, void, env, avr, tl) > DEF_HELPER_3(stvehx, void, env, avr, tl) > DEF_HELPER_3(stvewx, void, env, avr, tl) > diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c > index a56051a..608803f 100644 > --- a/target-ppc/mem_helper.c > +++ b/target-ppc/mem_helper.c > @@ -290,6 +290,12 @@ uint64_t helper_bswap32x2(uint64_t x) > return deposit64((x >> 32), 32, 32, (x)); > } > =20 > +uint64_t helper_bswap16x4(uint64_t x) > +{ > + uint64_t m =3D 0x00ff00ff00ff00ffull; > + return ((x & m) << 8) | ((x >> 8) & m); > +} This doesn't seem to match the bswap32x2 function above. bswap32x2 just swaps the two 32-bit words in the 64-bit word. This one swaps the bytes in each individual 16 bit work in the 64-bit word. I suspect the bswap32x2 is wrong, which would explain why the previous patch didn't seem to make sense. > + > #undef HI_IDX > #undef LO_IDX > =20 > diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/v= sx-impl.inc.c > index e3374df..caa6660 100644 > --- a/target-ppc/translate/vsx-impl.inc.c > +++ b/target-ppc/translate/vsx-impl.inc.c > @@ -108,6 +108,63 @@ static void gen_lxvw4x(DisasContext *ctx) > tcg_temp_free(EA); > } > =20 > +static void gen_lxvb16x(DisasContext *ctx) > +{ > + TCGv EA; > + TCGv_i64 xth =3D cpu_vsrh(xT(ctx->opcode)); > + TCGv_i64 xtl =3D cpu_vsrl(xT(ctx->opcode)); > + > + if (unlikely(!ctx->vsx_enabled)) { > + gen_exception(ctx, POWERPC_EXCP_VSXU); > + return; > + } > + gen_set_access_type(ctx, ACCESS_INT); > + EA =3D tcg_temp_new(); > + gen_addr_reg_index(ctx, EA); > + if (ctx->le_mode) { > + tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ); > + tcg_gen_addi_tl(EA, EA, 8); > + tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ); > + } else { > + tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_LEQ); > + gen_helper_bswap32x2(xth, xth); I really don't understand how a bswap32x2 helps here, either as it's defined now, or as I suspect it should be defined by analogy with bswap16x4. > + tcg_gen_addi_tl(EA, EA, 8); > + tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_LEQ); > + gen_helper_bswap32x2(xtl, xtl); Also.. if I'm understanding the ISA correctly, this instruction loads subsequent higher-address bytes into subsequent (i.e. less signficant, since IBM uses BE bit/byte numbering) bytes in the vector. Doesn't that mean you want a BE load in all cases, not just the LE guest case? > + } > + tcg_temp_free(EA); > +} > + > +static void gen_lxvh8x(DisasContext *ctx) > +{ > + TCGv EA; > + TCGv_i64 xth =3D cpu_vsrh(xT(ctx->opcode)); > + TCGv_i64 xtl =3D cpu_vsrl(xT(ctx->opcode)); > + > + if (unlikely(!ctx->vsx_enabled)) { > + gen_exception(ctx, POWERPC_EXCP_VSXU); > + return; > + } > + gen_set_access_type(ctx, ACCESS_INT); > + EA =3D tcg_temp_new(); > + gen_addr_reg_index(ctx, EA); > + > + if (ctx->le_mode) { > + tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ); > + gen_helper_bswap16x4(xth, xth); > + tcg_gen_addi_tl(EA, EA, 8); > + tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ); > + gen_helper_bswap16x4(xtl, xtl); > + } else { > + tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_LEQ); > + gen_helper_bswap32x2(xth, xth); > + tcg_gen_addi_tl(EA, EA, 8); > + tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_LEQ); > + gen_helper_bswap32x2(xtl, xtl); Again, I think you want a BE load in both cases, and the bswap32x2 makes no sense to me. > + } > + tcg_temp_free(EA); > +} > + > #define VSX_STORE_SCALAR(name, operation) \ > static void gen_##name(DisasContext *ctx) \ > { \ > diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vs= x-ops.inc.c > index 414b73b..598b349 100644 > --- a/target-ppc/translate/vsx-ops.inc.c > +++ b/target-ppc/translate/vsx-ops.inc.c > @@ -7,6 +7,8 @@ GEN_HANDLER_E(lxsspx, 0x1F, 0x0C, 0x10, 0, PPC_NONE, PPC2= _VSX207), > GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX), > GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX), > GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX), > +GEN_HANDLER_E(lxvh8x, 0x1F, 0x0C, 0x19, 0, PPC_NONE, PPC2_ISA300), > +GEN_HANDLER_E(lxvb16x, 0x1F, 0x0C, 0x1B, 0, PPC_NONE, PPC2_ISA300), > =20 > GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX), > GEN_HANDLER_E(stxsibx, 0x1F, 0xD, 0x1C, 0, PPC_NONE, PPC2_ISA300), --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --F67xnxMyVwCoXwbx Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIcBAEBAgAGBQJX2fxWAAoJEGw4ysog2bOS3oYP/1SlaDJSXrSnkVkeD0xHn36i g736PkvNIP8puH/n8lZa/PNxtBcAkqP4sX5/Zztxyvs1MBJc+19bxOwHg2sgoM7P c8gxhdEwXRLcCUDsse2mH1BqEUro5iD4x83cSblvsMTUIKyyTGxEk31JYo/iK2dQ f25qMZMegdID+DzIezgAZDjoAidC+XLqPXT2rm86hTsue5nbiEDU9zs1NjnS+aMJ uMiRfrv9P2eOi69+IPE2d1weG1M01aDsfg16g+ZU9xCp1xeooAJeM25KM8Lmou7k 2kJamHuJuSlpE3q+DBHsWhFUd0dN4h8pvmRJ62Ax1jNDgpFFAmMblIhY/6obr3oJ p/mY4YUabf83LvAJJhQkE9jzlQUwFBOqfsezaBIwKEE1bj3GDzIY7G/pXP85slKP w9Sl/DEkVZSNxseP86ugeB8R8vUXzvoWAoMFgQoK+rGDBkB7kyNT8P4SCnC4dCGQ jCSnorfO6DDrC+H18DBWKoABxKEPPYqyqUz5Pl3GcOeJRZak/8UJT6tc3izBuETi PkrvL6wwmyL0oklrrzN0wxT80sW8er9kUG9k3wUuwndcrHZVeuZkxQsXcLKRnkH0 LNg3ha9xLn8HuYAXRgV+UQHm8EnxyvQ1pXezuL0hJ+2E8s/n6qTqj2O3XaTxKdfS OluLuofWnFAjS/eUd6Lg =AVRW -----END PGP SIGNATURE----- --F67xnxMyVwCoXwbx--