From: George Dunlap
Subject: Re: [v7 PATCH 02/10] xen: sched: introduce soft-affinity and use it instead d->node-affinity
Date: Tue, 10 Jun 2014 12:26:45 +0100
Message-ID: <5396EB75.4070101@eu.citrix.com>
To: Dario Faggioli, xen-devel@lists.xen.org
Cc: keir@xen.org, Ian.Campbell@citrix.com, Andrew.Cooper3@citrix.com, George.Dunlap@citrix.com, JBeulich@suse.com, Ian.Jackson@citrix.com

On 06/10/2014 01:44 AM, Dario Faggioli wrote:
> Before this change, each vcpu had its own vcpu-affinity
> (in v->cpu_affinity), representing the set of pcpus where
> the vcpu is allowed to run. Since NUMA-aware scheduling
> was introduced, the scheduler (credit1 only, for now) also
> tries as hard as it can to run all the vcpus of a domain
> on one of the nodes that constitute the domain's
> node-affinity.
>
> The idea here is to make the mechanism more general by:
>  * allowing this 'preference' for some pcpus/nodes to be
>    expressed on a per-vcpu basis, rather than for the domain
>    as a whole. That is to say, each vcpu should have its own
>    set of preferred pcpus/nodes, rather than it being the
>    very same for all the vcpus of the domain;
>  * generalizing the idea of 'preferred pcpus' beyond NUMA
>    awareness and support.
>    That is to say, independently of whether
>    it is (mostly) useful on NUMA systems, it should
>    be possible to specify, for each vcpu, a set of pcpus where
>    it prefers to run (in addition to, and possibly unrelated to,
>    the set of pcpus where it is allowed to run).
>
> We will call this set of *preferred* pcpus the vcpu's
> soft affinity. This change introduces it and starts using it
> for scheduling, replacing the indirect use of the domain's NUMA
> node-affinity. This is more general, as soft affinity does not
> have to be related to NUMA. Nevertheless, it allows achieving the
> same results as NUMA-aware scheduling, just by making soft affinity
> equal to the domain's node affinity for all the vCPUs (e.g.,
> from the toolstack).
>
> This also means renaming most of the NUMA-aware scheduling related
> functions, in credit1, to something more generic, hinting toward
> the concept of soft affinity rather than directly to NUMA awareness.
>
> As a side effect, this simplifies the code quite a bit. In fact,
> prior to this change, we needed to cache the translation of
> d->node_affinity (which is a nodemask_t) to a cpumask_t, since that
> is what scheduling decisions require (we used to keep it in
> node_affinity_cpumask). This, and all the complicated logic
> required to keep it updated, is no longer necessary.
>
> The high level description of NUMA placement and scheduling in
> docs/misc/xl-numa-placement.markdown is being updated too, to match
> the new architecture.
>
> Signed-off-by: Dario Faggioli
> Reviewed-by: George Dunlap

Probably should have taken this off; but in any case:

Reviewed-by: George Dunlap
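
For readers following the thread, the distinction the commit message draws between the set of pcpus where a vcpu is allowed to run and the set where it prefers to run can be sketched as a two-step choice. This is only an illustration under stated assumptions, not the code from the patch: the `cpumask` typedef (a plain 64-bit bitmap), `first_cpu()` and `pick_pcpu()` are all hypothetical names invented here, and the real credit1 logic is considerably more involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit i set means pcpu i is in the set. */
typedef uint64_t cpumask;

/* Return the lowest-numbered pcpu in mask, or -1 if the mask is empty. */
static int first_cpu(cpumask mask)
{
    for (int cpu = 0; cpu < 64; cpu++)
        if (mask & ((cpumask)1 << cpu))
            return cpu;
    return -1;
}

/*
 * Two-step choice (illustrative only):
 *   1. prefer an idle pcpu that is both allowed (hard) and preferred (soft);
 *   2. otherwise fall back to any idle pcpu that is allowed (hard).
 * Returns -1 if no allowed pcpu is idle.
 */
static int pick_pcpu(cpumask hard, cpumask soft, cpumask idle)
{
    /* Assumption of this sketch: a vcpu is always allowed to run somewhere. */
    assert(hard != 0);

    int cpu = first_cpu(hard & soft & idle);
    if (cpu >= 0)
        return cpu;

    return first_cpu(hard & idle);
}
```

In this sketch the soft mask never widens the set of allowed pcpus; it only reorders the search within it, which is why a soft mask that does not intersect the hard mask simply has no effect.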