From: George Dunlap
Subject: Re: [PATCH RESEND 05/12] xen: numa-sched: make space for per-vcpu node-affinity
Date: Wed, 6 Nov 2013 11:44:40 +0000
Message-ID: <527A2BA8.9060601@eu.citrix.com>
In-Reply-To: <1383732000.9207.99.camel@Solace>
To: Dario Faggioli, Jan Beulich
Cc: Marcus Granado, Justin Weaver, Ian Campbell, Li Yechen, Andrew Cooper, Juergen Gross, Ian Jackson, Matt Wilson, xen-devel, Daniel De Graaf, Keir Fraser, Elena Ufimtseva

On 06/11/13 10:00, Dario Faggioli wrote:
> On mer, 2013-11-06 at 09:46 +0000, Jan Beulich wrote:
>>>>> On 06.11.13 at 10:39, Dario Faggioli wrote:
>>> Now, we're talking about killing vc->cpu_affinity and not introducing
>>> vc->node_affinity and, instead, introducing vc->cpu_hard_affinity and
>>> vc->cpu_soft_affinity and, more importantly, not linking any of the
>>> above to d->node_affinity. That means all the above operations
>>> _will_NOT_ automatically affect d->node_affinity any longer, at least
>>> from the hypervisor (and, most likely, libxc) perspective. OTOH, I'm
>>> almost sure that I can force libxl (and xl) to retain the exact same
>>> behaviour it is exposing to the user (just by adding an extra call
>>> when needed).
>>>
>>> So, although all this won't be an issue for xl and libxl consumers (or,
>>> at least, that's my goal), it will change how the hypervisor used to
>>> behave in all those situations. This means that xl and libxl users will
>>> see no change, while folks issuing hypercalls and/or libxc calls will.
>>>
>>> Is that ok? I mean, I know there are no stability concerns for those
>>> APIs, but still, is that an acceptable change?
>> I would think that as long as d->node_affinity is empty, it should
>> still be set based on all vCPU-s' hard affinities
>>
> I see, and that sounds sensible to me... It's mostly a matter of
> deciding whether or not we want something like that, and, if yes,
> whether we want it based on hard or soft affinities.
>
> Personally, I think I agree with you on having it based on hard
> affinities by default.
>
> Let's see if George gets to say something before I get to that part of
> the (re)implementation. :-)

I would probably have it based on soft affinities, since that's where we
expect the domain's vcpus to actually be running most of the time; but
it's really a bike-shed issue, and something we can change / adjust in
the future.
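(To make the kind of derivation we're discussing concrete, here's a rough
standalone sketch in plain C. The names, the struct, and the two-node /
eight-CPU topology are all made up for illustration; it's only a toy model
of the sort of thing domain_update_node_affinity() does in the hypervisor,
not the actual code. It re-derives a domain's node affinity from its
vCPUs' affinity masks, preferring soft affinity where it's set and falling
back to hard affinity otherwise.)

/*
 * Toy model only (made-up names, assumed topology): re-derive a domain's
 * node affinity from its vCPUs' affinity masks, preferring soft affinity
 * where it is set and falling back to hard affinity otherwise.
 */
#include <stdio.h>

#define NR_NODES 2

/* Assumed topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1 (one bit per CPU). */
static const unsigned int node_to_cpus[NR_NODES] = { 0x0f, 0xf0 };

struct toy_vcpu {
    unsigned int hard_affinity;  /* one bit per CPU */
    unsigned int soft_affinity;  /* one bit per CPU; 0 means "not set" */
};

/* Return a mask with one bit per node the domain has affinity with. */
static unsigned int derive_node_affinity(const struct toy_vcpu *v, int nr_vcpus)
{
    unsigned int cpus = 0, nodes = 0;
    int i, node;

    /* Aggregate the vCPUs' preferred CPUs (soft if set, else hard). */
    for ( i = 0; i < nr_vcpus; i++ )
        cpus |= v[i].soft_affinity ? v[i].soft_affinity : v[i].hard_affinity;

    /* A node is in the domain's affinity if any preferred CPU lives there. */
    for ( node = 0; node < NR_NODES; node++ )
        if ( cpus & node_to_cpus[node] )
            nodes |= 1u << node;

    return nodes;
}

int main(void)
{
    struct toy_vcpu vcpus[2] = {
        { .hard_affinity = 0xff, .soft_affinity = 0x03 }, /* prefers CPUs 0-1 */
        { .hard_affinity = 0xff, .soft_affinity = 0x0c }, /* prefers CPUs 2-3 */
    };

    printf("node affinity mask: %#x\n", derive_node_affinity(vcpus, 2));
    return 0;
}

Run as-is, that prints a node-affinity mask of 0x1, since both vCPUs'
soft affinities fall within node 0's CPUs.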
(Although I suppose the ideal behavior would be for the allocator to have
three levels of preference instead of just two: allocate from soft
affinity first; if that's not available, allocate from hard affinity;
and finally allocate wherever you can find RAM. But that's probably more
work than it's worth at this point.)

So what's the plan now, Dario? You're going to re-spin the patch series
to just do hard and soft affinities at the HV level, plumbing the
results through the toolstack?

I think for now I might advise putting off doing a NUMA interface at the
libxl level, and doing a full vNUMA interface in another series (perhaps
for 4.5, depending on the timing).

 -George
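P.S. Purely to illustrate the three-level fallback I describe above,
here's a toy sketch in plain C. The function name and the bitmask
representation are made up, and this is not the Xen page allocator; it
just shows the order in which the candidate node sets would be tried.

/*
 * Toy model only (made-up names): the three-level placement preference,
 * i.e. try soft affinity first, then hard affinity, then any node that
 * still has free memory. All masks have one bit per node.
 */
#include <stdio.h>

static unsigned int pick_alloc_nodes(unsigned int soft, unsigned int hard,
                                     unsigned int has_free)
{
    if ( soft & has_free )       /* 1st choice: nodes in the soft affinity */
        return soft & has_free;
    if ( hard & has_free )       /* 2nd choice: nodes in the hard affinity */
        return hard & has_free;
    return has_free;             /* last resort: wherever there is free RAM */
}

int main(void)
{
    /* Soft affinity = node 0, hard affinity = nodes 0-1, free RAM on node 2. */
    printf("candidate nodes: %#x\n", pick_alloc_nodes(0x1, 0x3, 0x4));
    return 0;
}

In the example, only node 2 has free memory, so both the soft and hard
preferences come up empty and the allocation falls through to the
last-resort case.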