From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dario Faggioli
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Wed, 22 Jul 2015 16:50:45 +0200
Message-ID: <1437576645.5036.56.camel@citrix.com>
In-Reply-To: <55AFA41E.1080101@suse.com>
To: Juergen Gross
Cc: Elena Ufimtseva, Wei Liu, Andrew Cooper, David Vrabel, Jan Beulich, "xen-devel@lists.xenproject.org", Boris Ostrovsky
List-Id: xen-devel@lists.xenproject.org

On Wed, 2015-07-22 at 16:09 +0200, Juergen Gross wrote:
> On 07/22/2015 03:58 PM, Boris Ostrovsky wrote:
> > What if I configure a guest to
> > follow HW topology? I.e. I pin VCPUs to appropriate cores/threads?
> > With elfnote I am stuck with disabled topology.
>
> Add an option to do exactly that: follow HW topology (pin vcpus,
> configure vnuma)?
>
I thought about configuring things in such a way that they match the
host topology, as Boris is suggesting, too. And in that case, arranging
for that to be done in the toolstack, if PV vNUMA is identified (as I
think Juergen is suggesting), seems a good approach.

However, when I tried to do that manually on my box, I didn't seem to
be able to. Here's what I tried. Since I have this host topology:

cpu_topology           :
cpu:    core    socket     node
  0:       0        1        0
  1:       0        1        0
  2:       1        1        0
  3:       1        1        0
  4:       9        1        0
  5:       9        1        0
  6:      10        1        0
  7:      10        1        0
  8:       0        0        1
  9:       0        0        1
 10:       1        0        1
 11:       1        0        1
 12:       9        0        1
 13:       9        0        1
 14:      10        0        1
 15:      10        0        1

I configured the guest like this:

vcpus   = '4'
memory  = '1024'
vnuma   = [ [ "pnode=0","size=512","vcpus=0-1","vdistances=10,20" ],
            [ "pnode=1","size=512","vcpus=2-3","vdistances=20,10" ] ]
cpus    = ["0","1","8","9"]

This means vcpus 0 and 1, which are assigned to vnode 0, are pinned to
pcpus 0 and 1, which are siblings, per the host topology. Similarly,
vcpus 2 and 3, assigned to vnode 1, are pinned to two sibling pcpus on
pnode 1. This seems to be honoured:

# xl vcpu-list 4
Name         ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
test          4     0     0   -b-      10.9  0 / 0-7
test          4     1     1   -b-       7.6  1 / 0-7
test          4     2     8   -b-       0.1  8 / 8-15
test          4     3     9   -b-       0.1  9 / 8-15

And yet, no joy:

# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
# xl vcpu-list 4
Name         ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
test          4     0     0   r--      16.4  0 / 0-7
test          4     1     1   r--      12.5  1 / 0-7
test          4     2     8   -b-       0.2  8 / 8-15
test          4     3     9   -b-       0.1  9 / 8-15

So, what am I doing wrong at "following the hw topology"?
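FWIW, if the toolstack were to do this automatically (per Juergen's
suggestion), the core of it would be mapping each vnode's vcpus onto
sibling pcpus of its pnode. A minimal sketch of just that mapping
logic, for the host topology above; pin_plan() is a hypothetical
helper, not actual libxl code:

```python
# Hypothetical sketch (not libxl code): derive a vcpu -> pcpu pinning
# that keeps each vnode's vcpus on sibling threads of its pnode,
# mirroring what was done by hand with cpus=["0","1","8","9"] above.
from collections import defaultdict

# (pcpu, core, node) tuples for the host topology shown above
# (socket column omitted; it is not needed for the mapping).
HOST = [
    (0, 0, 0), (1, 0, 0), (2, 1, 0), (3, 1, 0),
    (4, 9, 0), (5, 9, 0), (6, 10, 0), (7, 10, 0),
    (8, 0, 1), (9, 0, 1), (10, 1, 1), (11, 1, 1),
    (12, 9, 1), (13, 9, 1), (14, 10, 1), (15, 10, 1),
]

def pin_plan(vnodes):
    """vnodes: list of (pnode, [vcpus]); returns {vcpu: pcpu}."""
    # Group the host pcpus into sibling sets, keyed by (node, core).
    cores = defaultdict(list)
    for pcpu, core, node in HOST:
        cores[(node, core)].append(pcpu)
    plan = {}
    for pnode, vcpus in vnodes:
        # Walk this pnode's cores in order, handing out sibling pcpus,
        # so consecutive vcpus of a vnode land on sibling threads.
        free = [p for (n, _), ps in sorted(cores.items())
                if n == pnode for p in ps]
        for vcpu, pcpu in zip(vcpus, free):
            plan[vcpu] = pcpu
    return plan

# vnode 0 -> pnode 0 (vcpus 0-1), vnode 1 -> pnode 1 (vcpus 2-3):
print(pin_plan([(0, [0, 1]), (1, [2, 3])]))
# -> {0: 0, 1: 1, 2: 8, 3: 9}, i.e. the same pinning as the config.
```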
> > Besides, this is not necessarily a NUMA-only issue, it's a scheduling
> > one (inside the guest) as well.
>
> Sure. That's what Jan said regarding SUSE's xen-kernel. No topology info
> (or a trivial one) might be better than the wrong one...
>
Yep. Exactly. As Boris says, this is a generic scheduling issue, although
it's true that (as far as I can tell) it's only with vNUMA that it bites
us so hard... I mean, performance is always going to be inconsistent, but
it's only in that case that you basically _lose_ some of the vcpus! :-O

Dario

-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel