From: Dario Faggioli
Subject: Re: [PATCH 03 of 10 v2] xen: sched_credit: let the scheduler know about node-affinity
Date: Thu, 20 Dec 2012 19:18:07 +0100
To: George Dunlap
Cc: Marcus Granado, Dan Magenheimer, Ian Campbell, Anil Madhavapeddy,
 Andrew Cooper, Juergen Gross, Ian Jackson, xen-devel@lists.xen.org,
 Jan Beulich, Daniel De Graaf, Matt Wilson
In-Reply-To: <50D3414D.8080901@eu.citrix.com>
References: <06d2f322a6319d8ba212.1355944039@Solace> <50D3414D.8080901@eu.citrix.com>

On Thu, Dec 20, 2012 at 5:48 PM, George Dunlap wrote:
> On 19/12/12 19:07, Dario Faggioli wrote:
>> +static inline int
>> +__csched_vcpu_should_migrate(int cpu, cpumask_t *mask, cpumask_t *idlers)
>> +{
>> +    /*
>> +     * Consent to migration if cpu is one of the idlers in the VCPU's
>> +     * affinity mask. In fact, if that is not the case, it just means it
>> +     * was some other CPU that was tickled and should hence come and pick
>> +     * VCPU up. Migrating it to cpu would only make things worse.
>> +     */
>> +    return cpumask_test_cpu(cpu, idlers) && cpumask_test_cpu(cpu, mask);
>> +}
>
> And in any case, looking at the caller of csched_load_balance(), it
> explicitly says to steal work if the next thing on the runqueue of cpu
> has a priority of TS_OVER. That was chosen for a reason -- if you want
> to change that, you should change it there at the top (and make a
> justification for doing so), not deeply nested in a function like this.
>
> Or am I completely missing something?
>
No, you're right. Trying to solve a nasty issue I was seeing, I
overlooked that I was changing the underlying logic along the way...
Thanks!

What I want to avoid is the following: a vcpu wakes up on the busy pcpu
Y and, as a consequence, the idle pcpu X is tickled. Then, for some
unrelated reason, pcpu Z reschedules and, as it would go idle too, it
looks around for a vcpu to steal, finds one in Y's runqueue and grabs
it. Afterward, when X gets the IPI and schedules, it just does not find
anyone to run and goes back to idling.

Now, suppose the vcpu has X, but *not* Z, in its node-affinity (while
it has a full vcpu-affinity, i.e., it can run everywhere). In this
case, a vcpu that could have run on a pcpu in its node-affinity ends up
executing outside of it. That happens because the NODE_BALANCE_STEP in
csched_load_balance(), when called by Z, won't find anything suitable
to steal (provided there actually isn't any vcpu waiting in any
runqueue with node-affinity to Z), while the CPU_BALANCE_STEP will find
our vcpu. :-(
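To make that concrete, here is a minimal, compilable model of the
two-step stealing decision. It is only a sketch: the names
(NODE_BALANCE_STEP, can_steal(), the plain-bitmask affinities) are made
up for illustration and are not the actual sched_credit.c interfaces.

#include <stdbool.h>
#include <stdio.h>

#define NODE_BALANCE_STEP  0   /* first pass: node-affinity only */
#define CPU_BALANCE_STEP   1   /* second pass: plain vcpu-affinity */

struct vcpu_masks {
    unsigned long cpu_affinity;   /* pcpus the vcpu may run on */
    unsigned long node_affinity;  /* pcpus in the vcpu's node */
};

/* Would a balancer running on pcpu 'cpu' steal this vcpu at 'step'? */
static bool can_steal(const struct vcpu_masks *v, int cpu, int step)
{
    unsigned long mask = v->cpu_affinity;

    if ( step == NODE_BALANCE_STEP )
        mask &= v->node_affinity;

    return (mask >> cpu) & 1;
}

int main(void)
{
    /* Full vcpu-affinity (pcpus 0-3), node-affinity only {0} (= X). */
    struct vcpu_masks v = { .cpu_affinity = 0xf, .node_affinity = 0x1 };
    int Z = 2;   /* a pcpu outside the vcpu's node-affinity */

    /* Z's node balance step skips the vcpu... */
    printf("node step steals: %d\n", can_steal(&v, Z, NODE_BALANCE_STEP));
    /* ...but its cpu balance step grabs it, beating the tickled pcpu X
     * and pulling the vcpu off its node. */
    printf("cpu  step steals: %d\n", can_steal(&v, Z, CPU_BALANCE_STEP));
    return 0;
}

The second printf() is exactly the unwanted outcome: Z wins the race
against the tickled pcpu X, and the vcpu runs outside its node.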
So, what I wanted is something that could tell me whether the pcpu that
is stealing work is the one that was actually tickled to do so. I was
using the pcpu's idleness as a (cheap and easy to check) indication of
that, but I now see this is having side effects I did not want to cause
in the first place. Sorry for that, I probably spent so much time
buried, as you were saying, in the various nested loops and calls that
I lost the context a little bit! :-P

Ok, I think the problem I was describing is real, and I've seen it
happening and causing performance degradation. However, as a good
solution is going to be more complex than I thought, I'd better repost
without this function and deal with it in a future, separate patch
(after having figured out the best way of doing so). Is that fine with
you?

> These changes all look right.
>
At least. :-)

> But then, I'm a bit tired, so I'll give it another once-over
> tomorrow. :-)
>
I can imagine; looking forward to your next comments.

Thanks a lot and Regards,
Dario

--
<> (Raistlin Majere)
---------------------------------------------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)