From: Dario Faggioli
Subject: Re: Virt overehead with HT [was: Re: Xen 4.5 development update]
Date: Tue, 15 Jul 2014 00:44:10 +0200
Message-ID: <1405377850.5333.17.camel@Solace>
In-Reply-To: <53C421F4.9070501@bobich.net>
To: Gordan Bobic
Cc: Lars Kurth, George Dunlap, Ross Lagerwall, "stefano.stabellini@citrix.com", "xen-devel@lists.xenproject.org"

On Mon, 2014-07-14 at 19:31 +0100, Gordan Bobic wrote:
> On 07/14/2014 06:22 PM, Dario Faggioli wrote:
> > I'll try more runs, e.g. with number of VCPUs equal to or less than
> > nr_cores/2 and see what happens.
> >
> > Again, thoughts?
>
> Have you tried it with VCPUs pinned to appropriate PCPUs?
>
Define "appropriate".

I have a run in which I pinned VCPU#1-->PCPU#1, VCPU#2-->PCPU#2, and so on, and the result is even worse:

Average Half load -j 4 Run (std deviation):
Elapsed Time 37.808 (0.538999)

Average Optimal load -j 8 Run (std deviation):
Elapsed Time 26.594 (0.235223)

Average Maximal load -j Run (std deviation):
Elapsed Time 27.9 (0.131149)

This is actually something I expected, since pinning does not allow a VCPU to move away from a hyperthread with a busy sibling, even when it could have.

In fact, you can expect better results from pinning only if you pin not only the VCPUs to the PCPUs, but also kernbench's build jobs to the appropriate (V)CPUs in the guest. That, however, is not only really impractical, but also hardly representative as a benchmark, I think.

If you pin VCPU#1 to PCPU#1 and VCPU#2 to PCPU#2, with PCPU#1 and PCPU#2 being HT siblings, what prevents Linux (in the guest) from running two of the four build jobs on VCPU#1 and VCPU#2 (i.e., on sibling PCPUs!!) for the whole length of the benchmark? Nothing, I think.

And in fact, pinning would also result in good (near native, perhaps?) performance if we were exposing the SMT topology details to guests since, in that case, Linux would do the balancing properly. However, that's not the case either. :-(

But, perhaps, you were referring to a different pinning strategy?

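To make the sibling argument above a bit more concrete, here is a rough sketch in Python. It is only a toy model, not a measurement: it assumes 8 hardware threads laid out as 4 cores x 2 threads, with consecutive PCPU ids sharing a core (the real enumeration on a given box may differ), 0-based ids, and a guest that places the 4 "half load" jobs on VCPUs with no knowledge of the topology. The names `core_of` and `cores_used` are made up for the illustration.

```python
from itertools import combinations


def core_of(pcpu, threads_per_core=2):
    """Assumed topology: consecutive PCPU ids are HT siblings on one core."""
    return pcpu // threads_per_core


def cores_used(job_vcpus, pinning):
    """Physical cores kept busy when jobs run on the given VCPUs."""
    return {core_of(pinning[v]) for v in job_vcpus}


# 1:1 pinning, VCPU n --> PCPU n, on 8 hardware threads.
pinning = {v: v for v in range(8)}

# Every way a topology-unaware guest could place 4 jobs on 8 VCPUs.
placements = list(combinations(range(8), 4))

worst = min(len(cores_used(p, pinning)) for p in placements)
best = max(len(cores_used(p, pinning)) for p in placements)
# Placements where no two jobs share a core (one job per core).
spread = sum(1 for p in placements if len(cores_used(p, pinning)) == 4)

print(worst, best, spread, len(placements))  # 2 4 16 70
```

In this model only 16 of the 70 possible placements put one job on each core, and with hard 1:1 pinning the hypervisor cannot fix up the other 54 by moving a VCPU to a thread whose sibling is idle. How often the guest actually picks a bad placement depends on its scheduler, which this sketch does not model.
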
Regards,
Dario

-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)