From: Dario Faggioli
Subject: Re: [PATCH 07 of 10 [RFC]] sched_credit: Let the scheduler know about `node affinity`
Date: Wed, 02 May 2012 17:13:06 +0200
To: George Dunlap
Cc: Andre Przywara, Ian Campbell, Stefano Stabellini, Juergen Gross, Ian Jackson, "xen-devel@lists.xen.org", Jan Beulich

On Fri, 2012-04-27 at 15:45 +0100, George Dunlap wrote:
> Hey Dario,
>
Hi!

> Sorry for the long delay in reviewing this.
>
No problem, thanks to you for taking the time to look at the patches so
thoroughly!

> Overall I think the approach is good.
>
That's nice to hear. :-)

> >  /*
> > + * Node Balancing
> > + */
> > +#define CSCHED_BALANCE_NODE_AFFINITY    1
> > +#define CSCHED_BALANCE_CPU_AFFINITY     0
> > +#define CSCHED_BALANCE_START_STEP       CSCHED_BALANCE_NODE_AFFINITY
> > +#define CSCHED_BALANCE_END_STEP         CSCHED_BALANCE_CPU_AFFINITY
> > +
> > +
> This thing of defining "START_STEP" and "END_STEP" seems a bit fragile.
> I think it would be better to always start at 0, and go until
> CSCHED_BALANCE_MAX.
>
Ok, I agree that this is fragile and probably also a bit overkill. I'll
make it simpler, as you're suggesting.

> > +    /*
> > +     * Let's cache the domain's dom->node_affinity here as an
> > +     * optimization for a couple of hot paths. In fact,
> > +     * knowing whether or not dom->node_affinity has changed
> > +     * would allow us to avoid rebuilding node_affinity_cpumask
> > +     * (below) during node balancing and/or scheduling.
> > +     */
> > +    nodemask_t node_affinity_cache;
> > +    /* Based on what dom->node_affinity says,
> > +     * on what CPUs would we like to run most? */
> > +    cpumask_t node_affinity_cpumask;
> I think the comments here need to be clearer. The main points are:
> * node_affinity_cpumask is the dom->node_affinity translated from a
> nodemask into a cpumask
> * Because doing the nodemask -> cpumask translation may be expensive,
> node_affinity_cache stores the last translated value, so we can avoid
> doing the translation if nothing has changed.
>
Ok, will do.

> > +/*
> > + * Sort-of conversion between node-affinity and vcpu-affinity for the domain,
> > + * i.e., a cpumask containing all the cpus from all the set nodes in the
> > + * node-affinity mask of the domain.
> This needs to be clearer -- vcpu-affinity doesn't have anything to do
> with this function, and there's nothing "sort-of" about the conversion. :-)
>
> I think you mean to say, "Create a cpumask from the node affinity mask."
>
Exactly, I'll try to clarify.
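Just to show the direction I have in mind, here's a rough sketch of the
reworked function and comment (sketch only: exact helper names are from
memory, and I'm glossing over locking and the on-stack cpumask_t for now):

/*
 * Create a cpumask from the node affinity mask of the domain, i.e., a
 * cpumask containing all the cpus from all the nodes set in
 * dom->node_affinity. As the nodemask -> cpumask translation can be
 * expensive, only redo it if the cached nodemask has actually changed.
 */
static void csched_build_balance_cpumask(struct csched_dom *sdom)
{
    cpumask_t cpus;
    int node;

    /* dom->node_affinity did not change: cached cpumask is still good */
    if ( nodes_equal(sdom->node_affinity_cache, sdom->dom->node_affinity) )
        return;

    sdom->node_affinity_cache = sdom->dom->node_affinity;

    cpumask_clear(&sdom->node_affinity_cpumask);
    for_each_node_mask ( node, sdom->node_affinity_cache )
    {
        /* accumulate the cpus of each node in the node-affinity mask */
        cpus = node_to_cpumask(node);
        cpumask_or(&sdom->node_affinity_cpumask,
                   &sdom->node_affinity_cpumask, &cpus);
    }
}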
> >  static inline void
> > +__cpumask_tickle(cpumask_t *mask, const cpumask_t *idle_mask)
> > +{
> > +    CSCHED_STAT_CRANK(tickle_idlers_some);
> > +    if ( opt_tickle_one_idle )
> > +    {
> > +        this_cpu(last_tickle_cpu) =
> > +            cpumask_cycle(this_cpu(last_tickle_cpu), idle_mask);
> > +        cpumask_set_cpu(this_cpu(last_tickle_cpu), mask);
> > +    }
> > +    else
> > +        cpumask_or(mask, mask, idle_mask);
> > +}
> I don't see any reason to make this into a function -- it's only called
> once, and it's not that long. Unless you're concerned about too many
> indentations making the lines too short?
>
That was part of it, but I'll put it back and see if I can have it
looking good enough. :-)

> >      sdom->dom = dom;
> > +    /*
> > +     * XXX This would be 'The Right Thing', but as it is still too
> > +     *     early and d->node_affinity has not settled yet, maybe we
> > +     *     can just init the two masks with something like all-nodes
> > +     *     and all-cpus and rely on the first balancing call for
> > +     *     having them updated?
> > +     */
> > +    csched_build_balance_cpumask(sdom);
> We might as well do what you've got here, unless it's likely to produce
> garbage. This isn't exactly a hot path. :-)
>
Well, I wouldn't call it garbage, so that's probably fine. I was more
concerned with the whole thing being clear and meaningful enough: having
this here could make us (and/or future coders :-D) think that we can
rely on the balance mask somehow, while that is not entirely true. I'll
re-check the code and see how I can make it better for the next
posting...

> Other than that, looks good -- Thanks!
>
Good to know, thanks to you!

Regards,
Dario

--
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
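P.S. Regarding the balancing steps, this is roughly what I mean by
making it simpler, along the lines of what you're suggesting (again,
just a sketch, the actual names may well change):

/* Node balancing steps: always start at 0 and go until CSCHED_BALANCE_MAX */
#define CSCHED_BALANCE_NODE_AFFINITY    0
#define CSCHED_BALANCE_CPU_AFFINITY     1
#define CSCHED_BALANCE_MAX              (CSCHED_BALANCE_CPU_AFFINITY + 1)

/* iterate over the steps in both the balancing and scheduling paths */
#define for_each_csched_balance_step(step) \
    for ( (step) = 0; (step) < CSCHED_BALANCE_MAX; (step)++ )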