From: George Dunlap
Subject: Re: [PATCH v2 10/16] xen: sched: use soft-affinity instead of domain's node-affinity
Date: Thu, 14 Nov 2013 15:30:33 +0000
Message-ID: <5284EC99.3070607@eu.citrix.com>
In-Reply-To: <20131113191233.18086.60472.stgit@Solace>
References: <20131113190852.18086.5437.stgit@Solace> <20131113191233.18086.60472.stgit@Solace>
To: Dario Faggioli, xen-devel@lists.xen.org
Cc: Marcus Granado, Keir Fraser, Ian Campbell, Li Yechen, Andrew Cooper, Juergen Gross, Ian Jackson, Jan Beulich, Justin Weaver, Matt Wilson, Elena Ufimtseva
List-Id: xen-devel@lists.xenproject.org

On 13/11/13 19:12, Dario Faggioli wrote:
> Now that we have it, use soft affinity for scheduling, and replace
> the indirect use of the domain's NUMA node-affinity. This is more
> general, as soft affinity does not have to be related to NUMA. At
> the same time, it makes it possible to achieve the same results as
> NUMA-aware scheduling, just by making soft affinity equal to the
> domain's node affinity, for all the vCPUs (e.g., from the toolstack).
>
> This also means renaming most of the NUMA-aware scheduling-related
> functions in credit1 to something more generic, hinting at the
> concept of soft affinity rather than directly at NUMA awareness.
>
> As a side effect, this simplifies the code quite a bit. In fact,
> prior to this change, we needed to cache the translation of
> d->node_affinity (which is a nodemask_t) to a cpumask_t, since that
> is what scheduling decisions require (we used to keep it in
> node_affinity_cpumask). This, and all the complicated logic
> required to keep it updated, is not necessary any longer.
>
> The high level description of NUMA placement and scheduling in
> docs/misc/xl-numa-placement.markdown is being updated too, to match
> the new architecture.
>
> Signed-off-by: Dario Faggioli

Reviewed-by: George Dunlap

Just a few things to note below...

> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 4b8fca8..b599223 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -411,8 +411,6 @@ void domain_update_node_affinity(struct domain *d)
>                  node_set(node, d->node_affinity);
>      }
>
> -    sched_set_node_affinity(d, &d->node_affinity);
> -
>      spin_unlock(&d->node_affinity_lock);

At this point, the only thing inside the spinlock is contingent on
d->auto_node_affinity.

> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 398b095..0790ebb 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c

...

> -static inline int __vcpu_has_node_affinity(const struct vcpu *vc,
> +static inline int __vcpu_has_soft_affinity(const struct vcpu *vc,
>                                             const cpumask_t *mask)
>  {
> -    const struct domain *d = vc->domain;
> -    const struct csched_dom *sdom = CSCHED_DOM(d);
> -
> -    if ( d->auto_node_affinity
> -         || cpumask_full(sdom->node_affinity_cpumask)
> -         || !cpumask_intersects(sdom->node_affinity_cpumask, mask) )
> +    if ( cpumask_full(vc->cpu_soft_affinity)
> +         || !cpumask_intersects(vc->cpu_soft_affinity, mask) )
>          return 0;

At this point we've lost a way to make this check potentially much
faster (being able to check auto_node_affinity).
This isn't a super-hot path, but it does happen fairly frequently -- will the "cpumask_full()" check take a significant amount of time on, say, a 4096-core system? If so, we might think about "caching" the result of cpumask_full() at some point.
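
Just to illustrate what I mean by caching -- this is only a rough sketch,
the flag name (soft_aff_effective) and where it lives are made up, and it
assumes the usual Xen cpumask API plus the v->cpu_soft_affinity field this
series introduces:

    /* Hypothetical: a cached flag kept next to cpu_soft_affinity in
     * struct vcpu, true only if the mask actually constrains anything. */
    bool soft_aff_effective;

    /* Recompute the cached flag whenever the mask is written (i.e.
     * wherever the toolstack/domctl path updates soft affinity): */
    static void vcpu_set_soft_affinity(struct vcpu *v, const cpumask_t *mask)
    {
        cpumask_copy(v->cpu_soft_affinity, mask);
        v->soft_aff_effective = !cpumask_full(v->cpu_soft_affinity);
    }

    /* The scheduler-side check then reduces to a flag test in the
     * common "no soft affinity configured" case: */
    static inline int __vcpu_has_soft_affinity(const struct vcpu *vc,
                                               const cpumask_t *mask)
    {
        return vc->soft_aff_effective &&
               cpumask_intersects(vc->cpu_soft_affinity, mask);
    }

That way the O(nr_cpus) scan of the mask happens once per affinity update
rather than on every balancing step.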