From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dario Faggioli
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Mon, 20 Jul 2015 16:09:14 +0200
Message-ID: <1437401354.5036.19.camel@citrix.com>
References: <1437042762.28251.18.camel@citrix.com>
 <55A7A7F40200007800091D60@mail.emea.novell.com>
 <55A78DF2.1060709@citrix.com>
 <20150716152513.GU12455@zion.uk.xensource.com>
 <55A7D17C.5060602@citrix.com>
 <55A7D2CC.1050708@oracle.com>
 <55A7F7F40200007800092152@mail.emea.novell.com>
 <55A7DE45.4040804@citrix.com>
 <55A7E2D8.3040203@oracle.com>
 <55A8B83802000078000924AE@mail.emea.novell.com>
 <1437118075.23656.25.camel@citrix.com>
 <55A946C6.8000002@oracle.com>
In-Reply-To: <55A946C6.8000002@oracle.com>
To: Boris Ostrovsky
Cc: Elena Ufimtseva, Wei Liu, Andrew Cooper, David Vrabel, Jan Beulich,
 "xen-devel@lists.xenproject.org"
List-Id: xen-devel@lists.xenproject.org

On Fri, 2015-07-17 at 14:17 -0400, Boris Ostrovsky wrote:
> On 07/17/2015 03:27 AM, Dario Faggioli wrote:
> > In the meanwhile, what should we do? Document this? How? "don't use
> > vNUMA with PV guest in SMT enabled systems" seems a bit harsh... Is
> > there a workaround we can put in place/suggest?
>
> I haven't been able to reproduce this on my Intel box because I think I
> have different core enumeration.
>
Yes, most likely, that's highly topology dependent. :-(

> Can you try adding
>   cpuid=['0x1:ebx=xxxxxxxx00000001xxxxxxxxxxxxxxxx']
> to your config file?
>
Done (sorry for the delay, the testbox was busy doing other stuff).

Still no joy (.101 is the IP address of the guest, domain id 3):

root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# xl vcpu-list 3
Name    ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
test     3     0     4   r--      23.6  all / 0-7
test     3     1     9   r--      19.8  all / 0-7
test     3     2     8   -b-       0.4  all / 8-15
test     3     3     4   -b-       0.2  all / 8-15

*HOWEVER* it seems to have an effect. In fact, now, topology as it is
shown in /sys/... is different:

root@test:~# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0
(it was 0-1)

This, OTOH, is still the same:

root@test:~# cat /sys/devices/system/cpu/cpu0/topology/core_siblings_list
0-3

Also, I now see this:

[    0.150560] ------------[ cut here ]------------
[    0.150560] WARNING: CPU: 2 PID: 0 at ../arch/x86/kernel/smpboot.c:317 topology_sane.isra.2+0x74/0x88()
[    0.150560] sched: CPU #2's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.150560] Modules linked in:
[    0.150560] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0+ #1
[    0.150560]  0000000000000009 ffff88001ee2fdd0 ffffffff81657c7b ffffffff810bbd2c
[    0.150560]  ffff88001ee2fe20 ffff88001ee2fe10 ffffffff81081510 ffff88001ee2fea0
[    0.150560]  ffffffff8103aa02 ffff88003ea0a001 0000000000000000 ffff88001f20a040
[    0.150560] Call Trace:
[    0.150560]  [] dump_stack+0x4f/0x7b
[    0.150560]  [] ? up+0x39/0x3e
[    0.150560]  [] warn_slowpath_common+0xa1/0xbb
[    0.150560]  [] ? topology_sane.isra.2+0x74/0x88
[    0.150560]  [] warn_slowpath_fmt+0x46/0x48
[    0.150560]  [] ? __cpuid.constprop.0+0x15/0x19
[    0.150560]  [] topology_sane.isra.2+0x74/0x88
[    0.150560]  [] set_cpu_sibling_map+0x27a/0x444
[    0.150560]  [] ? numa_add_cpu+0x98/0x9f
[    0.150560]  [] cpu_bringup+0x63/0xa8
[    0.150560]  [] cpu_bringup_and_idle+0xe/0x1a
[    0.150560] ---[ end trace 63d204896cce9f68 ]---

Notice that it now says 'llc-sibling', while, before, it was saying
'smt-sibling'.

> On AMD, BTW, we fail a different test so some other bits probably need
> to be tweaked. You may fail it too (the LLC sanity check).
>
Yep, that's the one I guess.

Should I try something more/else?

Regards,
Dario

-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
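
As a reading aid for the cpuid= line quoted in this message: the mask is a 32-character string, most significant bit first, in which '0' and '1' force a bit and 'x' leaves it to the default policy. The sketch below decodes such an entry into the bit ranges it forces. The function names are hypothetical and written for this note; they are not part of xl or libxl, and the set of accepted mask characters ('0', '1', 'x', 'k', 's') is an assumption based on the xl.cfg documentation.

```python
def decode_cpuid_mask(mask):
    """Map each run of forced bits in a 32-char mask to its value.

    Returns {(high_bit, low_bit): value}. Hypothetical helper, not an xl API.
    """
    if len(mask) != 32:
        raise ValueError("mask must describe exactly 32 bits")
    if any(ch not in "01xks" for ch in mask):
        raise ValueError("unexpected character in mask")
    fields = {}
    run_start = None  # string index where the current run of 0/1 began
    # The trailing "x" is a sentinel that closes a run ending at bit 0.
    for i, ch in enumerate(mask + "x"):
        if ch in "01":
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # String index 0 is bit 31, index 31 is bit 0.
            high_bit = 31 - run_start
            low_bit = 31 - (i - 1)
            fields[(high_bit, low_bit)] = int(mask[run_start:i], 2)
            run_start = None
    return fields


def parse_cpuid_entry(entry):
    """Split 'LEAF:REG=MASK' into (leaf, register, forced fields)."""
    leaf, _, rest = entry.partition(":")
    reg, _, mask = rest.partition("=")
    return int(leaf, 16), reg, decode_cpuid_mask(mask)


if __name__ == "__main__":
    print(parse_cpuid_entry("0x1:ebx=xxxxxxxx00000001xxxxxxxxxxxxxxxx"))
```

For the entry used here, the only forced field is EBX bits 23:16 of leaf 1, set to 1. Per the Intel SDM that field reports the number of addressable logical processor IDs in the package, so forcing it to 1 would be consistent with thread_siblings_list changing from 0-1 to 0, while leaving the LLC-level view (core_siblings_list) untouched.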