From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Tue, 28 Jul 2015 00:19:09 +0100
Message-ID: <55B6BC6D.8020808@citrix.com>
References: <1437042762.28251.18.camel@citrix.com> <55B64A8A.7040200@citrix.com> <1438012950.5036.215.camel@citrix.com> <55B65CD6.7000607@citrix.com> <55B65D77.1050202@citrix.com> <1438018925.5036.242.camel@citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============4752888665161562622=="
Return-path:
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZJrfy-0000kA-QF for xen-devel@lists.xenproject.org; Mon, 27 Jul 2015 23:19:19 +0000
In-Reply-To: <1438018925.5036.242.camel@citrix.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dario Faggioli
Cc: Elena Ufimtseva, Wei Liu, David Vrabel, Jan Beulich, "xen-devel@lists.xenproject.org", Boris Ostrovsky
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--===============4752888665161562622==
Content-Type: multipart/alternative; boundary="------------080500000703040709080408"

This is a multi-part message in MIME format.
--------------080500000703040709080408
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

On 27/07/2015 18:42, Dario Faggioli wrote:
> On Mon, 2015-07-27 at 17:33 +0100, Andrew Cooper wrote:
>> On 27/07/15 17:31, David Vrabel wrote:
>>>
>>>> Yeah, indeed. That's the downside of Juergen's "Linux scheduler
>>>> approach". But the issue is there, even without taking vNUMA into
>>>> account, and I think something like that would really help (only for
>>>> Dom0, and Linux guests, of course).
>>> I disagree.  Whether we're using vNUMA or not, Xen should still ensure
>>> that the guest kernel and userspace see a consistent and correct
>>> topology using the native mechanisms.
>>
>> +1
>>
> +1 from me as well. In fact, a mechanism for making exactly such thing
> happen, was what I was after when starting the thread.
>
> Then it came up that CPUID needs to be used for at least two different
> and potentially conflicting purposes, that we want to support both and
> that, whether and for whatever reason it's used, Linux configures its
> scheduler after it, potentially resulting in rather pathological setups.

I don't see what the problem is here.  Fundamentally, "NUMA optimise" vs
"comply with licence" is a user/admin decision at boot time, and we need
not cater to both halves at the same time.

Supporting either, as chosen by the admin, is worthwhile.

>
>
> It's at that point that some decoupling started to appear
> interesting... :-P
>
> Also, are we really being consistent? If my methodology is correct
> (which might not be, please, double check, and sorry for that), I'm
> seeing quite some inconsistency around:
>
> HOST:
>  root@Zhaman:~# xl info -n
>  ...
>  cpu_topology           :
>  cpu:    core    socket     node
>    0:       0        1        0
>    1:       0        1        0
>    2:       1        1        0
>    3:       1        1        0
>    4:       9        1        0
>    5:       9        1        0
>    6:      10        1        0
>    7:      10        1        0
>    8:       0        0        1
>    9:       0        0        1
>   10:       1        0        1
>   11:       1        0        1
>   12:       9        0        1
>   13:       9        0        1
>   14:      10        0        1
>   15:      10        0        1

o_O

What kind of system results in this layout?  Can you dump the ACPI
tables and make them available?

>
>  ...
>  root@Zhaman:~# xl vcpu-list test
>  Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
>  test                                 2     0    0   r--       1.5  0 / all
>  test                                 2     1    1   r--       0.2  1 / all
>  test                                 2     2    8   -b-       2.2  8 / all
>  test                                 2     3    9   -b-       2.0  9 / all
>
> GUEST (HVM, 4 vcpus):
>  root@test:~# cpuid|grep CORE_ID
>    (APIC synth): PKG_ID=0 CORE_ID=16 SMT_ID=0
>    (APIC synth): PKG_ID=0 CORE_ID=16 SMT_ID=1
>    (APIC synth): PKG_ID=0 CORE_ID=0 SMT_ID=0
>    (APIC synth): PKG_ID=0 CORE_ID=0 SMT_ID=1
>
> HOST:
>  root@Zhaman:~# xl vcpu-pin 2 all 0
>  root@Zhaman:~# xl vcpu-list 2
>  Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
>  test                                 2     0    0   -b-      43.7  0 / all
>  test                                 2     1    0   -b-      38.4  0 / all
>  test                                 2     2    0   -b-      36.9  0 / all
>  test                                 2     3    0   -b-      38.8  0 / all
>
> GUEST:
>  root@test:~# cpuid|grep CORE_ID
>    (APIC synth): PKG_ID=0 CORE_ID=16 SMT_ID=0
>    (APIC synth): PKG_ID=0 CORE_ID=16 SMT_ID=0
>    (APIC synth): PKG_ID=0 CORE_ID=16 SMT_ID=0
>    (APIC synth): PKG_ID=0 CORE_ID=16 SMT_ID=0
>
> HOST:
>  root@Zhaman:~# xl vcpu-pin 2 0 7
>  root@Zhaman:~# xl vcpu-pin 2 1 7
>  root@Zhaman:~# xl vcpu-pin 2 2 15
>  root@Zhaman:~# xl vcpu-pin 2 3 15
>  root@Zhaman:~# xl vcpu-list 2
>  Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
>  test                                 2     0    7   -b-      44.3  7 / all
>  test                                 2     1    7   -b-      38.9  7 / all
>  test                                 2     2   15   -b-      37.3  15 / all
>  test                                 2     3   15   -b-      39.2  15 / all
>
> GUEST:
>  root@test:~# cpuid|grep CORE_ID
>    (APIC synth): PKG_ID=0 CORE_ID=26 SMT_ID=1
>    (APIC synth): PKG_ID=0 CORE_ID=26 SMT_ID=1
>    (APIC synth): PKG_ID=0 CORE_ID=10 SMT_ID=1
>    (APIC synth): PKG_ID=0 CORE_ID=10 SMT_ID=1
>
> So, it looks to me that:
>  1) any application using CPUID for either licensing or
>     placement/performance optimization will get (potentially) random
>     results;
>  2) whatever set of values the kernel used, during guest boot, to build
>     up its internal scheduling data structures, has no guarantee of
>     being related to any value returned by CPUID, at a later point.
>
> Hence, I think I'm seeing inconsistency between kernel and userspace
> (and between userspace and itself, over time) already... Am I
> overlooking something?

All current CPUID values presented to guests are about as reliable as
being picked from /dev/urandom.  (This isn't strictly true - the feature
flags will be in the right ballpark if the VM has not migrated yet).

Fixing this (as described in my feature levelling design document) is
sufficiently non-trivial that it has been deferred to post
feature-levelling work.

~Andrew
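
For readers following the quoted numbers: one reading that fits them is
that the guest's CPUID shows the APIC ID of whichever host pCPU the vCPU
is running on, decoded with different field widths than the host uses.
Nothing in the thread confirms this. The sketch below is only an
illustration under stated assumptions: the bit widths, the "odd pCPU is
the second thread" rule and all function names are hypothetical, chosen
because they reproduce the quoted PKG_ID/CORE_ID/SMT_ID values for pCPUs
0, 1, 7, 8, 9 and 15.

```python
# Host layout copied from the quoted `xl info -n` output: pCPU -> (core, socket).
HOST_TOPOLOGY = {
    0: (0, 1), 1: (0, 1), 2: (1, 1), 3: (1, 1),
    4: (9, 1), 5: (9, 1), 6: (10, 1), 7: (10, 1),
    8: (0, 0), 9: (0, 0), 10: (1, 0), 11: (1, 0),
    12: (9, 0), 13: (9, 0), 14: (10, 0), 15: (10, 0),
}

# ASSUMPTION: field widths picked to fit the quoted output, not read from
# Xen or from the real machine.
HOST_SMT_BITS, HOST_CORE_BITS = 1, 4
GUEST_SMT_BITS, GUEST_CORE_BITS = 1, 5


def host_apic_id(pcpu):
    """Build a hypothetical host APIC ID as socket | core | smt bit fields."""
    core, socket = HOST_TOPOLOGY[pcpu]
    smt = pcpu % 2  # ASSUMPTION: sibling threads are adjacent pCPU numbers
    return (socket << (HOST_SMT_BITS + HOST_CORE_BITS)) | (core << HOST_SMT_BITS) | smt


def decode_as_guest(apic_id):
    """Split an APIC ID into (pkg, core, smt) using the assumed guest widths."""
    smt = apic_id & ((1 << GUEST_SMT_BITS) - 1)
    core = (apic_id >> GUEST_SMT_BITS) & ((1 << GUEST_CORE_BITS) - 1)
    pkg = apic_id >> (GUEST_SMT_BITS + GUEST_CORE_BITS)
    return pkg, core, smt


def guest_view(pcpu):
    """What a vCPU currently running on `pcpu` would report under this model."""
    return decode_as_guest(host_apic_id(pcpu))


if __name__ == "__main__":
    for pcpu in (0, 1, 8, 9, 7, 15):
        pkg, core, smt = guest_view(pcpu)
        print(f"pCPU {pcpu:2d}: PKG_ID={pkg} CORE_ID={core} SMT_ID={smt}")
```

Under these assumptions the host socket bit lands inside the guest's
CORE_ID field, which would explain CORE_ID=16 and CORE_ID=26 appearing
for pCPUs on socket 1, and why the values move whenever the pinning
changes.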

--------------080500000703040709080408--

--===============4752888665161562622==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--===============4752888665161562622==--