From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Thu, 16 Jul 2015 11:50:36 -0400
Message-ID: <55A7D2CC.1050708@oracle.com>
References: <1437042762.28251.18.camel@citrix.com>
 <55A7A7F40200007800091D60@mail.emea.novell.com>
 <55A78DF2.1060709@citrix.com>
 <20150716152513.GU12455@zion.uk.xensource.com>
 <55A7D17C.5060602@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1ZFlRC-0006cM-71 for xen-devel@lists.xenproject.org;
 Thu, 16 Jul 2015 15:51:06 +0000
In-Reply-To: <55A7D17C.5060602@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andrew Cooper , Wei Liu
Cc: Elena Ufimtseva , Dario Faggioli , David Vrabel , Jan Beulich ,
 "xen-devel@lists.xenproject.org"
List-Id: xen-devel@lists.xenproject.org

On 07/16/2015 11:45 AM, Andrew Cooper wrote:
> On 16/07/15 16:25, Wei Liu wrote:
>> On Thu, Jul 16, 2015 at 11:56:50AM +0100, Andrew Cooper wrote:
>>> On 16/07/15 11:47, Jan Beulich wrote:
>>>>>>> On 16.07.15 at 12:32, wrote:
>>>>> root@test:~# numactl --hardware
>>>>> available: 2 nodes (0-1)
>>>>> node 0 cpus: 0 1
>>>>> node 0 size: 475 MB
>>>>> node 0 free: 382 MB
>>>>> node 1 cpus: 2 3
>>>>> node 1 size: 495 MB
>>>>> node 1 free: 475 MB
>>>>> node distances:
>>>>> node   0   1
>>>>>   0:  10  10
>>>>>   1:  20  10
>>>>>
>>>>> root@test:~# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
>>>>> 0-1
>>>>> root@test:~# cat /sys/devices/system/cpu/cpu0/topology/core_siblings_list
>>>>> 0-3
>>>>> root@test:~# cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list
>>>>> 2-3
>>>>> root@test:~# cat /sys/devices/system/cpu/cpu2/topology/core_siblings_list
>>>>> 0-3
>>>>>
>>>>> So the complaint during boot seems to be against 'core_siblings' (which
>>>>> was not what I expected, but perhaps I misremember the meaning of
>>>>> "core_siblings" VS. "thread_siblings" VS. smt-siblings in Linux; I'll
>>>>> double check).
>>>>>
>>>>> Anyway, is there anything we can do to fix or work around things?
>>>> Make the guest honor topology also at the CPUID layer. Whether
>>>> that's by not wrongly consuming the respective CPUID bits (i.e. a
>>>> guest side change) or reflecting PV state in what the hypervisor
>>>> returns I'm not sure about. While the latter might be more clean,
>>>> I'd be afraid this might get in the way of what the tool stack wants
>>>> to see.
>>> Xen's CPUID handling currently has no concept of per-core and
>>> per-package data in the cpuid policy. The guest sees the information
>> Can / Will Xen have that concept in the future?
> It is certainly possible.
>
> I plan to try and lay some ground work as part of the feature levelling
> fixes, but fixing the hypervisor representation of cpuid is specifically
> out of scope for the feature levelling fixes (in a deliberate attempt to
> prevent the project expanding to fill more time than I have).
>
> It is on my list of areas to tackle, but it is several nested cans of worms.

Can't we set leaf 1's EBX[23:16] to 1?

-boris
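
For reference, a minimal sketch of the bit arithmetic behind that question.
It assumes the field meant is the logical processor count in CPUID leaf 1,
EBX bits 23:16, which is only meaningful when the HTT flag (leaf 1, EDX bit
28) is set. The function names and the sample EBX value are made up for
illustration; this is not Xen code and does not execute CPUID.

```python
# Hypothetical helpers: pure bit manipulation on a 32-bit EBX value.
FIELD_SHIFT = 16                 # field starts at bit 16
FIELD_MASK = 0xFF << FIELD_SHIFT  # covers bits 23:16


def leaf1_logical_count(ebx: int) -> int:
    """Extract the logical processor count from a leaf 1 EBX value."""
    return (ebx & FIELD_MASK) >> FIELD_SHIFT


def with_logical_count(ebx: int, count: int) -> int:
    """Return EBX with bits 23:16 replaced by count, other bits untouched."""
    if not 0 <= count <= 0xFF:
        raise ValueError("count must fit in 8 bits")
    cleared = ebx & ~FIELD_MASK & 0xFFFFFFFF
    return cleared | (count << FIELD_SHIFT)


if __name__ == "__main__":
    sample_ebx = 0x00040800  # illustrative value only: count field is 4
    print(leaf1_logical_count(sample_ebx))           # 4
    print(hex(with_logical_count(sample_ebx, 1)))    # 0x10800
```

With the field forced to 1, a guest that derives sibling masks from this
leaf would see one logical processor per package. Whether that is enough
to stop the topology complaint is exactly what is being asked here.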