From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49757) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bpR36-0000nC-Tc for qemu-devel@nongnu.org; Wed, 28 Sep 2016 22:26:14 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bpR32-0002AL-KH for qemu-devel@nongnu.org; Wed, 28 Sep 2016 22:26:12 -0400
Date: Thu, 29 Sep 2016 11:51:52 +1000
From: David Gibson
Message-ID: <20160929015152.GC8390@umbus.fritz.box>
References: <1475088120-20244-1-git-send-email-nikunj@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="oJ71EGRlYNjSvfq7"
Content-Disposition: inline
In-Reply-To: <1475088120-20244-1-git-send-email-nikunj@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4
To: Nikunj A Dadhania
Cc: qemu-ppc@nongnu.org, rth@twiddle.net, qemu-devel@nongnu.org, benh@kernel.crashing.org

--oJ71EGRlYNjSvfq7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Sep 29, 2016 at 12:11:51AM +0530, Nikunj A Dadhania wrote:
> This series contains 7 new instructions for POWER9 ISA3.0
> Use newer qemu load/store tcg helpers and optimize stxvw4x and lxvw4x.
>
> GCC was adding epilogue for every VSX instructions causing change in
> behaviour. For testing the load vector instructions used mfvsrld/mfvsrd
> for loading vsr to register. And for testing store vector, used mtvsrdd
> instructions. This helped in getting rid of the epilogue added by gcc.
>
> Patches:
> 01: mfvsrld: Move From VSR Lower Doubleword
> 02: mtvsrdd: Move To VSR Double Doubleword
> 03: mtvsrws: Move To VSR Word & Splat
> 04: lxvw4x: improve implementation
> 05: stxvw4x: improve implementation
> 06: lxvh8x: Load VSX Vector Halfword*8
> 07: stxvh8x: Store VSX Vector Halfword*8
> 08: lxvb16x: Load VSX Vector Byte*16
> 09: stxvb16x: Store VSX Vector Byte*16

I've applied everything that rth reviewed to ppc-for-2.8.

I've tweaked the ascii art diagrams describing the endianness
transformations. Specifically I removed the within-element spaces for
each element on the vector (not memory) side. That's to emphasise the
fact that in-register there's no endianness, just numbers.

>
> Changelog:
> v4:
> * Added gen_bswap16x8 inline for lxvh8x and stxvh8x in tcg
> * Dropped helper_bswap16x4
> * Use temporaries in stxvh8x and not clobber the register
>
> v3:
> * Added 3 new VSR instructions.
> * Fixed all the vector load/store instructions for BE/LE.
> * Added detailed commit messages to patches.
> * Dropped deposit32x2 and implemented it using tcg ops
>
> v2:
> * Fix lxvw4x/stxvw4x translation as LE/BE were both similar
>   one in tcg and other as helper
> * Rename bswap32x2 to deposit32x2 as it does not need to
>   swap content(32bit)
> * stxvh8x had a bug as David suggested.
>
> v1:
> * More load/store cleanups in byte reverse routines
> * ld64/st64 converted to newer macro and updated call sites
> * Cleanup load with reservation and store conditional
> * Return invalid random for darn instruction
>
> v0:
> * darn - read /dev/random to get the random number
> * xxspltib - make it PPC64 only
> * Consolidate load/store operations and use macros to generate qemu_st/ld
> * Simplify load/store vsx endian manipulation
>
> Nikunj A Dadhania (6):
>   target-ppc: improve lxvw4x implementation
>   target-ppc: improve stxvw4x implementation
>   target-ppc: add lxvh8x instruction
>   target-ppc: add stxvh8x instruction
>   target-ppc: add lxvb16x instruction
>   target-ppc: add stxvb16x instruction
>
> Ravi Bangoria (3):
>   target-ppc: Implement mfvsrld instruction
>   target-ppc: Implement mtvsrdd instruction
>   target-ppc: Implement mtvsrws instruction
>
>  target-ppc/translate/vsx-impl.inc.c | 238 ++++++++++++++++++++++++++++++++----
>  target-ppc/translate/vsx-ops.inc.c  |   7 ++
>  2 files changed, 221 insertions(+), 24 deletions(-)
>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--oJ71EGRlYNjSvfq7
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJX7HO2AAoJEGw4ysog2bOSpi0QAK6LqpDmFV39YFnI0ICkjEDj
SrRfdUsRbK3GdyaE3ArPVYGnZgNOpEaAx4mIC80/04q3vWIaeHg5QaMPs9+/JKpB
HdG9FQCpJPIo/x9H7FHioF79sL9uqFJPclc1ISRmCo7sroqvjw5lWGv8uODF+I0b
gvB1A3cSKU2qSt0g5cE8fTmsdizpE9pcym71af413kaw95GD+rPrWXL7yyPFTY/f
BDH2bc1m7SxHSnR6aSU7UfvJqjZYqFORk7HGbc3xxPf6ezYrlO6THLs1GFU73fKx
KtLY6pbusgsh1Pnxej0gycegTHwbZxSBmDPN6IodWFlVh59EdntbNmcRe3uZfIf4
qE7AslF3dqyURJCv6Jehh7W4k0evE4Ee4M+eRhcBH5gwBpAdt+m9I0J9X682cuJ+
nVo8Ug7SnQBn/4PFbWDoDceUvyd3u5hM+J6+dE80rBvqKsyfsF3nm2K/m8BIBeNW
vKbNLbF8XqhggkaZdGhH2KT4vaGC45Q8Lo3YRvy4YJS51TRnHSpjdnFX5nhue6GK
HswLwjJ8lNsXnzswPY9WWYKMXBWIDD8rg6Qq49aOvujc8NLmCt815sC2Pqj6DMRl
PU48R+AqGQm1DixaQMHmonMV6scdPe6bK5s6GKIiTA4JCvH0YIKi+rfTnT4PWIVe
T0vohrHoxvbvuKdzHDtl
=DbsT
-----END PGP SIGNATURE-----

--oJ71EGRlYNjSvfq7--