From: George Dunlap
Subject: Re: [PATCH RESEND 05/12] xen: numa-sched: make space for per-vcpu node-affinity
Date: Wed, 6 Nov 2013 11:44:40 +0000
Message-ID: <527A2BA8.9060601@eu.citrix.com>
In-Reply-To: <1383732000.9207.99.camel@Solace>
To: Dario Faggioli, Jan Beulich
Cc: Marcus Granado, Justin Weaver, Ian Campbell, Li Yechen, Andrew Cooper, Juergen Gross, Ian Jackson, Matt Wilson, xen-devel, Daniel De Graaf, Keir Fraser, Elena Ufimtseva

On 06/11/13 10:00, Dario Faggioli wrote:
> On mer, 2013-11-06 at 09:46 +0000, Jan Beulich wrote:
>>>>> On 06.11.13 at 10:39, Dario Faggioli wrote:
>>> Now, we're talking about killing vc->cpu_affinity and not introducing
>>> vc->node_affinity and, instead, introducing vc->cpu_hard_affinity and
>>> vc->cpu_soft_affinity and, more importantly, not linking any of the
>>> above to d->node_affinity. That means all the above operations
>>> _will_NOT_ automatically affect d->node_affinity any longer, at least
>>> from the hypervisor (and, most likely, libxc) perspective. OTOH, I'm
>>> almost sure that I can force libxl (and xl) to retain the exact same
>>> behaviour it is exposing to the user (just by adding an extra call
>>> when needed).
>>>
>>> So, although all this won't be an issue for xl and libxl consumers (or,
>>> at least, that's my goal), it will change how the hypervisor used to
>>> behave in all those situations. This means that xl and libxl users will
>>> see no change, while folks issuing hypercalls and/or libxc calls will.
>>>
>>> Is that ok? I mean, I know there are no stability concerns for those
>>> APIs, but still, is that an acceptable change?
>> I would think that as long as d->node_affinity is empty, it should
>> still be set based on all vCPU-s' hard affinities
>>
> I see, and that sounds sensible to me... It's mostly a matter of
> deciding whether or not we want something like that, and, if yes,
> whether we want it based on hard or soft affinities.
>
> Personally, I think I agree with you on having it based on hard
> affinities by default.
>
> Let's see if George gets to say something before I get to that part of
> the (re)implementation. :-)

I would probably have it based on soft affinities, since that's where we
expect the domain's vcpus to actually be running most of the time; but
it's really a bike-shed issue, and something we can change / adjust in
the future.
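(To make the kind of derivation we're discussing concrete, here's a rough
standalone sketch in plain C. The names, the struct, and the two-node /
eight-CPU topology are all made up for illustration; it's only a toy model
of the sort of thing domain_update_node_affinity() does in the hypervisor,
not the actual code. It re-derives a domain's node affinity from its
vCPUs' affinity masks, preferring soft affinity where it's set and falling
back to hard affinity otherwise.)

/*
 * Toy model only (made-up names, assumed topology): re-derive a domain's
 * node affinity from its vCPUs' affinity masks, preferring soft affinity
 * where it is set and falling back to hard affinity otherwise.
 */
#include <stdio.h>

#define NR_NODES 2

/* Assumed topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1 (one bit per CPU). */
static const unsigned int node_to_cpus[NR_NODES] = { 0x0f, 0xf0 };

struct toy_vcpu {
    unsigned int hard_affinity;  /* one bit per CPU */
    unsigned int soft_affinity;  /* one bit per CPU; 0 means "not set" */
};

/* Return a mask with one bit per node the domain has affinity with. */
static unsigned int derive_node_affinity(const struct toy_vcpu *v, int nr_vcpus)
{
    unsigned int cpus = 0, nodes = 0;
    int i, node;

    /* Aggregate the vCPUs' preferred CPUs (soft if set, else hard). */
    for ( i = 0; i < nr_vcpus; i++ )
        cpus |= v[i].soft_affinity ? v[i].soft_affinity : v[i].hard_affinity;

    /* A node is in the domain's affinity if any preferred CPU lives there. */
    for ( node = 0; node < NR_NODES; node++ )
        if ( cpus & node_to_cpus[node] )
            nodes |= 1u << node;

    return nodes;
}

int main(void)
{
    struct toy_vcpu vcpus[2] = {
        { .hard_affinity = 0xff, .soft_affinity = 0x03 }, /* prefers CPUs 0-1 */
        { .hard_affinity = 0xff, .soft_affinity = 0x0c }, /* prefers CPUs 2-3 */
    };

    printf("node affinity mask: %#x\n", derive_node_affinity(vcpus, 2));
    return 0;
}

Run as-is, that prints a node-affinity mask of 0x1, since both vCPUs'
soft affinities fall within node 0's CPUs.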
(Although I suppose the ideal behavior would be for the allocator to have
three levels of preference instead of just two: allocate from soft
affinity first; if that's not available, allocate from hard affinity;
and finally allocate wherever you can find RAM. But that's probably more
work than it's worth at this point.)

So what's the plan now, Dario? You're going to re-spin the patch series
to just do hard and soft affinities at the HV level, plumbing the
results through the toolstack?

I think for now I might advise putting off doing a NUMA interface at the
libxl level, and doing a full vNUMA interface in another series (perhaps
for 4.5, depending on the timing).

 -George
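P.S. Purely to illustrate the three-level fallback I describe above,
here's a toy sketch in plain C. The function name and the bitmask
representation are made up, and this is not the Xen page allocator; it
just shows the order in which the candidate node sets would be tried.

/*
 * Toy model only (made-up names): the three-level placement preference,
 * i.e. try soft affinity first, then hard affinity, then any node that
 * still has free memory. All masks have one bit per node.
 */
#include <stdio.h>

static unsigned int pick_alloc_nodes(unsigned int soft, unsigned int hard,
                                     unsigned int has_free)
{
    if ( soft & has_free )       /* 1st choice: nodes in the soft affinity */
        return soft & has_free;
    if ( hard & has_free )       /* 2nd choice: nodes in the hard affinity */
        return hard & has_free;
    return has_free;             /* last resort: wherever there is free RAM */
}

int main(void)
{
    /* Soft affinity = node 0, hard affinity = nodes 0-1, free RAM on node 2. */
    printf("candidate nodes: %#x\n", pick_alloc_nodes(0x1, 0x3, 0x4));
    return 0;
}

In the example, only node 2 has free memory, so both the soft and hard
preferences come up empty and the allocation falls through to the
last-resort case.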