From: Dario Faggioli
Subject: Re: [PATCH v6 00/10] vnuma introduction
Date: Tue, 22 Jul 2014 17:06:37 +0200
Message-ID: <1406041597.17850.74.camel@Solace>
In-Reply-To: <20140722144846.GB6448@zion.uk.xensource.com>
To: Wei Liu
Cc: keir@xen.org, Ian.Campbell@citrix.com, stefano.stabellini@eu.citrix.com, george.dunlap@eu.citrix.com, msw@linux.com, lccycc123@gmail.com, ian.jackson@eu.citrix.com, xen-devel@lists.xen.org, JBeulich@suse.com, Elena Ufimtseva
List-Id: xen-devel@lists.xenproject.org

On Tue, 2014-07-22 at 15:48 +0100, Wei Liu wrote:
> On Tue, Jul 22, 2014 at 04:03:44PM +0200, Dario Faggioli wrote:
> > I mean, even right now, PV guests see completely random cache-sharing
> > topology, and that does (at least potentially) affect performance, as
> > the guest scheduler will make incorrect/inconsistent assumptions.
> >
>
> Correct. It's just that it might be more obvious to see the problem with
> vNUMA.
>
Yep.

> > > Yes, given that you derive numa memory allocation from cpu pinning or
> > > use combination of cpu pinning, vcpu to vnode map and vnode to pnode
> > > map, in those cases those IDs might reflect the right topology.
> > >
> > Well, pinning does (should?) not always happen as a consequence of a
> > virtual topology being used.
> >
>
> That's true. I was just referring to the current status of the patch
> series. AIUI that's how it is implemented now, not necessarily the way it
> has to be.
>
Ok.

> > With the following guest configuration, in terms of vcpu pinning:
> >
> > 1) 2 vCPUs ==> same pCPUs
>
> 4 vcpus, I think.
>
> > root@benny:~# xl vcpu-list
> > Name                    ID  VCPU  CPU  State  Time(s)  CPU Affinity
> > debian.guest.osstest     9     0    0   -b-       2.7  0
> > debian.guest.osstest     9     1    0   -b-       5.2  0
> > debian.guest.osstest     9     2    7   -b-       2.4  7
> > debian.guest.osstest     9     3    7   -b-       4.4  7
> >

What I meant by "2 vCPUs" is that I was putting 2 vCPUs of the guest (0
and 1) on the same pCPU (0), and the other 2 (2 and 3) on another one (7).
That should have resulted in a guest topology where the two pairs of
vCPUs do not share at least the lowest cache level, but that is not what
happens.

> > 2) no SMT
> > root@benny:~# xl vcpu-list
> > Name                    ID  VCPU  CPU  State  Time(s)  CPU Affinity
> > debian.guest.osstest    11     0    0   -b-       0.6  0
> > debian.guest.osstest    11     1    2   -b-       0.4  2
> > debian.guest.osstest    11     2    4   -b-       1.5  4
> > debian.guest.osstest    11     3    6   -b-       0.5  6
> >
> > 3) Random
> > root@benny:~# xl vcpu-list
> > Name                    ID  VCPU  CPU  State  Time(s)  CPU Affinity
> > debian.guest.osstest    12     0    3   -b-       1.6  all
> > debian.guest.osstest    12     1    1   -b-       1.4  all
> > debian.guest.osstest    12     2    5   -b-       2.4  all
> > debian.guest.osstest    12     3    7   -b-       1.5  all
> >
> > 4) yes SMT
> > root@benny:~# xl vcpu-list
> > Name                    ID  VCPU  CPU  State  Time(s)  CPU Affinity
> > debian.guest.osstest    14     0    1   -b-       1.0  1
> > debian.guest.osstest    14     1    2   -b-       1.8  2
> > debian.guest.osstest    14     2    6   -b-       1.1  6
> > debian.guest.osstest    14     3    7   -b-       0.8  7
> >
> > And, in *all* these 4 cases, here's what I see:
> >
> > root@debian:~# cat /sys/devices/system/cpu/cpu*/topology/core_siblings_list
> > 0-3
> > 0-3
> > 0-3
> > 0-3
> >
> > root@debian:~# cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
> > 0-3
> > 0-3
> > 0-3
> > 0-3
> >
> > root@debian:~# lstopo
> > Machine (488MB) + Socket L#0 + L3 L#0 (8192KB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
> >   PU L#0 (P#0)
> >   PU L#1 (P#1)
> >   PU L#2 (P#2)
> >   PU L#3 (P#3)
> >
>
> I won't be surprised if guest builds up a wrong topology, as what real
> "ID"s it sees depends very much on what pcpus you pick.
>
Exactly, but if I pin all the guest vCPUs on specific host pCPUs from the
very beginning (pinning specified in the config file, which is what I'm
doing), I should be able to control that...

> Have you tried pinning vcpus to pcpus [0, 1, 2, 3]? That way you should
> be able to see the same topology as the one you saw in Dom0?
>
Well, at least some of the examples above should have shown some
non-shared cache levels already. Anyway, here it comes:

root@benny:~# xl vcpu-list
Name                    ID  VCPU  CPU  State  Time(s)  CPU Affinity
debian.guest.osstest    15     0    0   -b-       1.8  0
debian.guest.osstest    15     1    1   -b-       0.7  1
debian.guest.osstest    15     2    2   -b-       0.6  2
debian.guest.osstest    15     3    3   -b-       0.7  3

root@debian:~# hwloc-ls --of console
Machine (488MB) + Socket L#0 + L3 L#0 (8192KB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
  PU L#0 (P#0)
  PU L#1 (P#1)
  PU L#2 (P#2)
  PU L#3 (P#3)

root@debian:~# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    4
Core(s) per socket:    1
Socket(s):             1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Stepping:              3
CPU MHz:               3591.780
BogoMIPS:              7183.56
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K

So, no, that is not giving the same result as in Dom0. :-(

> > This is not the case for dom0 where (I booted with dom0_max_vcpus=4 on
> > the xen command line) I see this:
> >
>
> I guess this is because you're basically picking pcpu 0-3 for Dom0. It
> doesn't matter if you pin them or not.
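FTR, the kind of per-vCPU pinning I am talking about is specified in the
guest config file more or less like this, e.g. for case 1 above (details
are illustrative, and I am assuming the list form of cpus=, where entry N
pins vCPU N):

    # xl domain config fragment: vCPUs 0 and 1 go to pCPU 0,
    # vCPUs 2 and 3 go to pCPU 7
    name  = "debian.guest.osstest"
    vcpus = 4
    cpus  = ["0", "0", "7", "7"]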
>
That makes total sense and, in fact, I was not surprised about Dom0
looking like this... What does surprise me is not being able to get a
similar topology for the guest, no matter how I pin it... :-/

Dario

--
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)