From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43327)
        by lists.gnu.org with esmtp (Exim 4.71)
        (envelope-from )
        id 1bltC2-0003hG-Kl
        for qemu-devel@nongnu.org; Mon, 19 Sep 2016 03:40:48 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
        (envelope-from )
        id 1bltBy-0006o0-J4
        for qemu-devel@nongnu.org; Mon, 19 Sep 2016 03:40:45 -0400
Date: Mon, 19 Sep 2016 16:50:37 +1000
From: David Gibson
Message-ID: <20160919065037.GF20488@umbus>
References: <1474023111-11992-1-git-send-email-nikunj@linux.vnet.ibm.com>
        <1474023111-11992-3-git-send-email-nikunj@linux.vnet.ibm.com>
        <20160919061934.GC20488@umbus>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
        protocol="application/pgp-signature"; boundary="lIrNkN/7tmsD/ALM"
Content-Disposition: inline
In-Reply-To: <20160919061934.GC20488@umbus>
Subject: Re: [Qemu-devel] [PATCH v3 2/5] target-ppc: improve lxvw4x implementation
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Nikunj A Dadhania
Cc: qemu-ppc@nongnu.org, rth@twiddle.net, qemu-devel@nongnu.org,
        benh@kernel.crashing.org

--lIrNkN/7tmsD/ALM
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Sep 19, 2016 at 04:19:34PM +1000, David Gibson wrote:
> On Fri, Sep 16, 2016 at 04:21:48PM +0530, Nikunj A Dadhania wrote:
> > Load 8byte at a time and manipulate.
> >
> > Signed-off-by: Nikunj A Dadhania
> > ---
> >  target-ppc/helper.h                 |  1 +
> >  target-ppc/mem_helper.c             |  5 +++++
> >  target-ppc/translate/vsx-impl.inc.c | 19 +++++--------------
> >  3 files changed, 11 insertions(+), 14 deletions(-)
> >
> > diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> > index 966f2ce..9f6705d 100644
> > --- a/target-ppc/helper.h
> > +++ b/target-ppc/helper.h
> > @@ -297,6 +297,7 @@ DEF_HELPER_2(mtvscr, void, env, avr)
> >  DEF_HELPER_3(lvebx, void, env, avr, tl)
> >  DEF_HELPER_3(lvehx, void, env, avr, tl)
> >  DEF_HELPER_3(lvewx, void, env, avr, tl)
> > +DEF_HELPER_1(deposit32x2, i64, i64)
> >  DEF_HELPER_3(stvebx, void, env, avr, tl)
> >  DEF_HELPER_3(stvehx, void, env, avr, tl)
> >  DEF_HELPER_3(stvewx, void, env, avr, tl)
> > diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c
> > index 6548715..86e493e 100644
> > --- a/target-ppc/mem_helper.c
> > +++ b/target-ppc/mem_helper.c
> > @@ -285,6 +285,11 @@ STVE(stvewx, cpu_stl_data_ra, bswap32, u32)
> >  #undef I
> >  #undef LVE
> >
> > +uint64_t helper_deposit32x2(uint64_t x)
> > +{
> > +    return deposit64((x >> 32), 32, 32, (x));
> > +}
>
> It seems a shame to drop out to a helper for something this simple.
> How hard would it be to implement this.. wordswap, I guess you'd call
> it.. in tcg ops?
>
> I'm also not particularly fond of the deposit32x2 name, though a
> better one doesn't quickly come to mind.
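
For reference, the helper amounts to swapping the two 32-bit halves of
a 64-bit value, i.e. a rotate by 32. A minimal standalone sketch of
that equivalence follows. It is not QEMU code: deposit64() is
reimplemented locally on the assumption that it matches QEMU's
deposit64(value, start, length, fieldval), and wordswap() and
wordswap_selfcheck() are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in, assumed to mirror QEMU's deposit64(): replace the
 * 'length'-bit field starting at bit 'start' of 'value' with the low
 * bits of 'fieldval'. Valid for 0 < length <= 64 - start. */
static inline uint64_t deposit64(uint64_t value, int start, int length,
                                 uint64_t fieldval)
{
    uint64_t mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Same expression as the body of the proposed helper_deposit32x2(). */
static inline uint64_t deposit32x2(uint64_t x)
{
    return deposit64((x >> 32), 32, 32, (x));
}

/* Hypothetical equivalent: rotate left by 32 bits. */
static inline uint64_t wordswap(uint64_t x)
{
    return (x << 32) | (x >> 32);
}

/* Checks that both forms swap the 32-bit halves of a sample value. */
static inline void wordswap_selfcheck(void)
{
    uint64_t x = 0x0011223344556677ULL;

    assert(deposit32x2(x) == 0x4455667700112233ULL);
    assert(deposit32x2(x) == wordswap(x));
}
```

If the equivalence holds, a single rotate op in the translator
(something along the lines of tcg_gen_rotli_i64(xth, xth, 32), untested
here) might replace the helper call.
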
>
> > +
> >  #undef HI_IDX
> >  #undef LO_IDX
> >
> > diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
> > index eee6052..df278df 100644
> > --- a/target-ppc/translate/vsx-impl.inc.c
> > +++ b/target-ppc/translate/vsx-impl.inc.c
> > @@ -75,7 +75,6 @@ static void gen_lxvdsx(DisasContext *ctx)
> >  static void gen_lxvw4x(DisasContext *ctx)
> >  {
> >      TCGv EA;
> > -    TCGv_i64 tmp;
> >      TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
> >      TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
> >      if (unlikely(!ctx->vsx_enabled)) {
> > @@ -84,22 +83,14 @@ static void gen_lxvw4x(DisasContext *ctx)
> >      }
> >      gen_set_access_type(ctx, ACCESS_INT);
> >      EA = tcg_temp_new();
> > -    tmp = tcg_temp_new_i64();
> >
> >      gen_addr_reg_index(ctx, EA);
> > -    gen_qemu_ld32u_i64(ctx, tmp, EA);
> > -    tcg_gen_addi_tl(EA, EA, 4);
> > -    gen_qemu_ld32u_i64(ctx, xth, EA);
> > -    tcg_gen_deposit_i64(xth, xth, tmp, 32, 32);
> > -
> > -    tcg_gen_addi_tl(EA, EA, 4);
> > -    gen_qemu_ld32u_i64(ctx, tmp, EA);
> > -    tcg_gen_addi_tl(EA, EA, 4);
> > -    gen_qemu_ld32u_i64(ctx, xtl, EA);
> > -    tcg_gen_deposit_i64(xtl, xtl, tmp, 32, 32);
> > -
> > +    tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_LEQ);
> > +    gen_helper_deposit32x2(xth, xth);
> > +    tcg_gen_addi_tl(EA, EA, 8);
> > +    tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_LEQ);
> > +    gen_helper_deposit32x2(xtl, xtl);

..and I think this is wrong for BE mode.  The deposit32x2 will get the
words in the right order, but the bytes within each word will be wrong
because of the LE mode load on a BE setup.

> >      tcg_temp_free(EA);
> > -    tcg_temp_free_i64(tmp);
> >  }
> >
> >  #define VSX_STORE_SCALAR(name, operation)                     \
>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--lIrNkN/7tmsD/ALM--
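
The BE-mode concern raised in the review can be checked with plain
integer arithmetic. The sketch here models the removed sequence (two
guest-endian 32-bit loads, first word deposited into the high half)
and the proposed one (a single little-endian 64-bit load followed by
a swap of the 32-bit halves) on an 8-byte buffer. All function names
are hypothetical, the buffer contents are arbitrary, and this is a
model of the two sequences in the diff, not QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit word from four memory bytes, little-endian. */
static inline uint64_t ld32_le(const uint8_t *p)
{
    return ((uint64_t)p[0]) | ((uint64_t)p[1] << 8) |
           ((uint64_t)p[2] << 16) | ((uint64_t)p[3] << 24);
}

/* Assemble a 32-bit word from four memory bytes, big-endian. */
static inline uint64_t ld32_be(const uint8_t *p)
{
    return ((uint64_t)p[0] << 24) | ((uint64_t)p[1] << 16) |
           ((uint64_t)p[2] << 8) | ((uint64_t)p[3]);
}

/* 64-bit little-endian load, standing in for an MO_LEQ access. */
static inline uint64_t ld64_le(const uint8_t *p)
{
    return ld32_le(p) | (ld32_le(p + 4) << 32);
}

/* Removed sequence: two guest-endian word loads, first word on top. */
static inline uint64_t old_xth(const uint8_t *p, int big_endian)
{
    uint64_t w0 = big_endian ? ld32_be(p) : ld32_le(p);
    uint64_t w1 = big_endian ? ld32_be(p + 4) : ld32_le(p + 4);
    return (w0 << 32) | w1;
}

/* Proposed sequence: one LE 64-bit load, then swap the 32-bit halves. */
static inline uint64_t new_xth(const uint8_t *p)
{
    uint64_t x = ld64_le(p);
    return (x << 32) | (x >> 32);
}

/* Compares both sequences for an LE guest and for a BE guest. */
static inline void endian_selfcheck(void)
{
    static const uint8_t mem[8] = {
        0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77
    };

    /* LE guest: old and new sequences agree. */
    assert(old_xth(mem, 0) == 0x3322110077665544ULL);
    assert(new_xth(mem) == old_xth(mem, 0));

    /* BE guest: words land in the right halves, but each is byte-reversed. */
    assert(old_xth(mem, 1) == 0x0011223344556677ULL);
    assert(new_xth(mem) != old_xth(mem, 1));
}
```

Under this model, a BE guest would get the expected value from a
plain big-endian 64-bit load with no swap at all, so the load and the
swap would both need to depend on the guest's current endianness.
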