From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37296)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1bpSsF-0005YA-L6 for qemu-devel@nongnu.org;
	Thu, 29 Sep 2016 00:23:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1bpSsD-0007aq-4z for qemu-devel@nongnu.org;
	Thu, 29 Sep 2016 00:23:06 -0400
Date: Thu, 29 Sep 2016 13:55:33 +1000
From: David Gibson
Message-ID: <20160929035533.GJ8390@umbus.fritz.box>
References: <1475040687-27523-1-git-send-email-nikunj@linux.vnet.ibm.com>
 <1475040687-27523-5-git-send-email-nikunj@linux.vnet.ibm.com>
 <20160929013841.GB8390@umbus.fritz.box>
 <87eg43v0dl.fsf@abhimanyu.i-did-not-set--mail-host-address--so-tickle-me>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="BEa57a89OpeoUzGD"
Content-Disposition: inline
In-Reply-To: <87eg43v0dl.fsf@abhimanyu.i-did-not-set--mail-host-address--so-tickle-me>
Subject: Re: [Qemu-devel] [PATCH v4 4/9] target-ppc: improve lxvw4x implementation
To: Nikunj A Dadhania
Cc: qemu-ppc@nongnu.org, rth@twiddle.net, qemu-devel@nongnu.org, benh@kernel.crashing.org

--BEa57a89OpeoUzGD
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Sep 29, 2016 at 09:11:10AM +0530, Nikunj A Dadhania wrote:
> David Gibson writes:
>
> > [ Unknown signature status ]
> > On Wed, Sep 28, 2016 at 11:01:22AM +0530, Nikunj A Dadhania wrote:
> >> Load 8byte at a time and manipulate.
> >>
> >> Big-Endian Storage
> >> +-------------+-------------+-------------+-------------+
> >> | 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
> >> +-------------+-------------+-------------+-------------+
> >>
> >> Little-Endian Storage
> >> +-------------+-------------+-------------+-------------+
> >> | 33 22 11 00 | 77 66 55 44 | BB AA 99 88 | FF EE DD CC |
> >> +-------------+-------------+-------------+-------------+
> >>
> >> Vector load results in:
> >> +-------------+-------------+-------------+-------------+
> >> | 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
> >> +-------------+-------------+-------------+-------------+
> >
> > Ok.  I'm guessing from this that implementing those GPR<->VSR
> > instructions showed that the earlier versions were endian-incorrect as
> > I suspected.
> >
> > Have you verified that this new implementation is actually faster (or
> > at least no slower) on LE than the original implementation with
> > individual 32-bit stores?
>
> Result of million lxvw4x, mfvsrd/mfvsrld and print
>
> Without patch:
> ==============
> [tcg_test]$ time ../qemu/ppc64le-linux-user/qemu-ppc64le -cpu POWER9 le_lxvw4x >/dev/null
> real    0m2.812s
> user    0m2.792s
> sys     0m0.020s
> [tcg_test]$
>
> With patch:
> ===========
> [tcg_test]$ time ../qemu/ppc64le-linux-user/qemu-ppc64le -cpu POWER9 le_lxvw4x >/dev/null
> real    0m2.801s
> user    0m2.783s
> sys     0m0.018s
> [tcg_test]$
>
> Not much perceivable difference, is there a better way to benchmark?

Not dramatically, that I can think of.  A few tweaks you can make:

  * Increase the loop counter so the test simply runs for longer
  * Also run the test multiple times, so you can get an idea of how
    much the results vary from one run to another
  * Run the test on a system that's as idle of other activity as you
    can make it (at both host and guest level).

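The "run it several times and look at the spread" tweak can be sketched
as a small harness.  This is only an illustration, assuming a Unix host
with Python 3; the helper names user_seconds() and benchmark() are made
up for this example and are not part of QEMU or its test suite.

```python
import resource
import statistics
import subprocess
import sys


def user_seconds(cmd):
    """Run cmd once and return the user CPU time (in seconds) it consumed.

    RUSAGE_CHILDREN accumulates the usage of child processes that have
    been waited for, so the difference around one run is that run's cost.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
    subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
    return after - before


def benchmark(cmd, runs=10):
    """Run cmd `runs` times (at least 2); return (mean, stdev) of user time."""
    samples = [user_seconds(cmd) for _ in range(runs)]
    return statistics.mean(samples), statistics.stdev(samples)
```

Called as benchmark(["../qemu/ppc64le-linux-user/qemu-ppc64le", "-cpu",
"POWER9", "le_lxvw4x"]) once per build, this gives a mean and a standard
deviation for each; if the two means differ by less than the spread, the
difference is probably noise.
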
For our purposes the user time is probably the meaningful thing here,
and should show less variance than the system and real time.

Note that it would be interesting to get these results for both a
power and x86 host.

In any case the results above are enough to convince me that the
change isn't likely to be a significant regression.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

--BEa57a89OpeoUzGD
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJX7JCrAAoJEGw4ysog2bOSIq0QAIrCadDzaBT6yMVey5jyuFQB
RoqxTJxErbQ0wK79ro0A6fJwkTFwLuRuMDn1HOYY7kkIG8XS5JmxrgU5XHUqu9ut
dgrzdqUKDi/Abydc8B8/TIayh9DApvbqFovVob7EJg8y9j3TuPqLIl9TKVASiUm6
T55A+cFxoic29ygDdEC3oCf4Ebjuf7cQVY8XKZXJmlLsvBJA6scr265Ys0WNna8C
ZTDKheOX4k1Lz2UOTTc2mHmUqflbYgngdveYmz6mdmo/3DJ5IevRNAPorPEpwbSH
jjzQu88OuGSEUZMIg/Xx7kQjIxEVgfGkx5yu4wlbEPxAECIm9be8Qzmm5D1GH0iU
TO4nwq5nOk0dG3O8ywffvWUAlXYvutylNzYKpbRNtTzg9G53DGYje3m3/HOB8plb
LO09KJ4Pakn5l5O+9mf3R0F/8HKyYdLd4Y2+0pfsl8gBS46L5v3NUC0bZoP+dlE3
ZECMeX2/Trp15pmOSzug3NsZ3AXhDX1w2xqml/ozXmUwt62AHFPOO/kDBvD21ZVe
f0od/T9Ui6Dv6AMDxfIUUdwi/OrIhTqzpqnMYW5Qlng2AsuynAh3BOPqg/oe4aLd
97MkLxBknaSunUZVlsCW1omZWKFgVvJlYNccVnIENOXcx8FW//igIU7A6kpEpUNu
k7p2PzQezOwRjaT6n8yd
=O4fl
-----END PGP SIGNATURE-----

--BEa57a89OpeoUzGD--