From: Juergen Gross
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Mon, 27 Jul 2015 16:02:39 +0200
Message-ID: <55B639FF.40609@suse.com>
References: <55AFAC34.1060606@oracle.com> <55B070ED.2040200@suse.com>
 <1437660433.5036.96.camel@citrix.com> <55B21364.5040906@suse.com>
 <1437749076.4682.47.camel@citrix.com> <55B25650.4030402@suse.com>
 <55B258C9.4040400@suse.com> <1437753509.4682.78.camel@citrix.com>
 <20150724160948.GA2067@l.oracle.com> <55B26570.1060008@suse.com>
 <20150724162911.GC2220@l.oracle.com> <55B26A45.2050402@suse.com>
 <55B26B84.1000101@oracle.com> <55B5B504.2030504@suse.com>
 <55B60DE7.1020300@suse.com> <55B611F1.80508@citrix.com>
 <55B61DA5.5030903@suse.com> <1438003395.5036.122.camel@citrix.com>
In-Reply-To: <1438003395.5036.122.camel@citrix.com>
To: Dario Faggioli
Cc: Elena Ufimtseva, Wei Liu, George Dunlap, Andrew Cooper, George Dunlap,
 David Vrabel, Jan Beulich, xen-devel@lists.xenproject.org, Boris Ostrovsky
List-Id: xen-devel@lists.xenproject.org

On 07/27/2015 03:23 PM, Dario Faggioli wrote:
> On Mon, 2015-07-27 at 14:01 +0200, Juergen Gross wrote:
>> On 07/27/2015 01:11 PM, George Dunlap wrote:
>
>>> Or alternately, if the user wants to give up on the "consolidation"
>>> aspect of virtualization, they can pin vcpus to pcpus and then pass in
>>> the actual host topology (hyperthreads and all).
>>
>> There would be another solution, of course:
>>
>> Support hyperthreads in the Xen scheduler via gang scheduling. While
>> this is not a simple solution, it is a fair one. Hyperthreads on one
>> core can influence each other quite a lot. With both threads always
>> running vcpus of the same guest, the penalty/advantage would stay
>> within the same domain. The guest could make really sensible
>> scheduling decisions and the licensing would still work as desired.
>>
> This is interesting indeed, but I'd much rather see it as something
> orthogonal, which may indeed bring benefits in some of the scenarios
> described here, but should not be considered *the* solution.

Correct. I still think it should be done.

> Implementing, enabling and asking users to use something like this will
> impact the system behavior and performance, in ways that may not be
> desirable for all use cases.

I'd make it a scheduler parameter, so you could enable it for a specific
cpupool where you want it to be active.

> So, while I do think that this may be something nice to have and offer,
> trying to use it for solving the problem we're debating here would make
> things even more complex to configure.
>
> Also, this would take care of HT related issues, but what about cores
> (as in 'should vcpus be cores of sockets or full sockets') and !HT boxes
> (like AMD)?

!HT boxes will have no problem: we won't have to hide hyperthreads as
cores...

Regarding many sockets with 1 core each vs. 1 socket with many cores:
I think 1 socket is okay for the non-NUMA case; we'll want multiple
sockets for NUMA.

> Not to mention, as you say yourself, that it's not easy to implement.
Yeah, but it will be fun. ;-)

>
>> Just an idea, but maybe worth exploring further instead of tweaking
>> more and more bits to make the virtual system somehow act sane.
>>
> Sure, and it's interesting indeed, for a bunch of reasons and
> purposes (as Tim is also noting). Not so much --or at least not
> necessarily-- for this one, IMO.

It's especially interesting regarding accounting: a vcpu running for
1 second can do much more work if no other vcpu is running on the same
core. That would then be the guest's problem, just like on bare metal.

For real-time purposes it might even be interesting to schedule only
1 vcpu per core, so that vcpu runs at a reliably high speed.


Juergen
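
PS: To make the gang-scheduling constraint above a bit more concrete,
here is a minimal, stand-alone sketch (plain C, nothing taken from the
Xen tree; all type and function names are made up for illustration).
It only models the rule that a vcpu may be placed on a hyperthread if
the sibling thread is idle or already runs a vcpu of the same domain:

#include <stdbool.h>
#include <stdio.h>

/* Illustrative only -- not Xen scheduler code. */
struct vcpu_info {
    int domid;      /* owning domain; -1 means the thread is idle */
    int vcpu_id;
};

/*
 * Gang constraint: 'candidate' may run on a hyperthread whose sibling
 * currently runs 'sibling' only if the sibling is idle or belongs to
 * the same domain.
 */
static bool gang_constraint_ok(const struct vcpu_info *candidate,
                               const struct vcpu_info *sibling)
{
    if (sibling->domid < 0)
        return true;                            /* sibling thread is idle */
    return candidate->domid == sibling->domid;  /* same guest only */
}

int main(void)
{
    struct vcpu_info running = { .domid = 3, .vcpu_id = 0 };  /* on sibling */
    struct vcpu_info same    = { .domid = 3, .vcpu_id = 1 };
    struct vcpu_info other   = { .domid = 7, .vcpu_id = 0 };

    printf("vcpu of same domain:  %s\n",
           gang_constraint_ok(&same, &running) ? "allowed" : "blocked");
    printf("vcpu of other domain: %s\n",
           gang_constraint_ok(&other, &running) ? "allowed" : "blocked");
    return 0;
}

A real implementation would of course have to enforce this at every
scheduling decision (and preempt both siblings together), which is where
the complexity Dario mentions comes in.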