From: Juergen Gross
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Fri, 24 Jul 2015 18:48:19 +0200
Message-ID: <55B26C53.9000607@suse.com>
In-Reply-To: <55B26A92.8060004@oracle.com>
To: Boris Ostrovsky, Dario Faggioli
Cc: Elena Ufimtseva, Wei Liu, Andrew Cooper, David Vrabel, Jan Beulich,
 "xen-devel@lists.xenproject.org"

On 07/24/2015 06:40 PM, Boris Ostrovsky wrote:
> On 07/24/2015 12:10 PM, Juergen Gross wrote:
>>
>> If we can fiddle with the masks on boot, we could do it in a running
>> system, too. Another advantage with not relying on cpuid. :-)
>
> I am trying to catch up with this thread so I may have missed it, but I
> still don't understand why we don't want to rely on CPUID.
>
> I think I saw Juergen said --- because it's HW-specific. But what's
> wrong with that? The hypervisor is building virtualized x86 (in this
> case) hardware, and on such HW CPUID is the standard way of determining
> thread/core topology. Plus various ACPI tables and such.
>
> And having a solution that doesn't address userspace (when there *is* a
> solution that can do it) doesn't seem like the best approach. Yes, it
> still won't cover userspace for PV guests, but neither will the kernel
> patch.
>
> As far as licensing is concerned --- are we sure this can't also be
> addressed by CPUID? BTW, if I was asked who is most concerned about
> licensing, my first answer would be --- databases. I.e. userspace.

The problem is to construct CPUID values which enable the Linux
scheduler to work correctly in spite of the hypervisor scheduler moving
vcpus between pcpus. The only way to do this is to emulate
single-threaded cores on the NUMA nodes without any further grouping,
so either many single-core sockets or one socket with many cores.

This might be problematic for licensing: the multi-socket layout might
require a higher license based on the number of sockets, while a
core-based license would be more expensive because no hyperthreads are
detectable.
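Just to illustrate what "emulate single-threaded cores" means on the
CPUID side, here is a minimal sketch (not from any patch, just an
illustration using GCC's <cpuid.h>, and assuming the virtual CPU
exposes leaf 0xb at all) of how a guest derives its thread/core
topology from the extended topology leaf:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx, level;
    unsigned int smt_width = 0, core_width = 0;

    for (level = 0; ; level++) {
        __cpuid_count(0xb, level, eax, ebx, ecx, edx);
        unsigned int type = (ecx >> 8) & 0xff;  /* 1 = SMT, 2 = core */

        if (type == 0)              /* invalid level: enumeration done */
            break;
        if (type == 1)              /* APIC ID bits used by threads */
            smt_width = eax & 0x1f;
        else if (type == 2)         /* APIC ID bits used by core+thread */
            core_width = eax & 0x1f;
    }

    printf("threads/core: %u, cores/socket: %u\n",
           1u << smt_width, 1u << (core_width - smt_width));
    return 0;
}

For the "one socket with many cores" layout the SMT level would have to
report a width of 0, so the scheduler (and userspace) sees no thread
siblings at all, only independent cores below the socket.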
> (Also, I don't know whether this is still true but in the past APICID
> format was also used for topology discovery. Just to make things a bit
> more interesting ;-))

Another +1 for the pv-solution. :-)


Juergen