From: Dario Faggioli <dario.faggioli@citrix.com>
To: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Marcus Granado <Marcus.Granado@eu.citrix.com>,
	Dan Magenheimer <dan.magenheimer@oracle.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Anil Madhavapeddy <anil@recoil.org>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Juergen Gross <juergen.gross@ts.fujitsu.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	Jan Beulich <JBeulich@suse.com>,
	Daniel De Graaf <dgdegra@tycho.nsa.gov>,
	Matt Wilson <msw@amazon.com>
Subject: Re: [PATCH 03 of 10 v2] xen: sched_credit: let the scheduler know about node-affinity
Date: Thu, 20 Dec 2012 19:18:07 +0100
Message-ID: <CAAWQectVEihrayJj5n4SPGqA0QJSiC7s2x_oDW=KHyxukWpMSA@mail.gmail.com>
In-Reply-To: <50D3414D.8080901@eu.citrix.com>

On Thu, Dec 20, 2012 at 5:48 PM, George Dunlap
<george.dunlap@eu.citrix.com> wrote:
> On 19/12/12 19:07, Dario Faggioli wrote:
>> +static inline int
>> +__csched_vcpu_should_migrate(int cpu, cpumask_t *mask, cpumask_t *idlers)
>> +{
>> +    /*
>> +     * Consent to migration if cpu is one of the idlers in the VCPU's
>> +     * affinity mask. In fact, if that is not the case, it just means it
>> +     * was some other CPU that was tickled and should hence come and pick
>> +     * VCPU up. Migrating it to cpu would only make things worse.
>> +     */
>> +    return cpumask_test_cpu(cpu, idlers) && cpumask_test_cpu(cpu, mask);
>>   }
>
> And in any case, looking at the caller of csched_load_balance(), it
> explicitly says to steal work if the next thing on the runqueue of cpu has a
> priority of TS_OVER.  That was chosen for a reason -- if you want to change
> that, you should change it there at the top (and make a justification for
> doing so), not deeply nested in a function like this.
>
> Or am I completely missing something?
>
No, you're right. While trying to solve a nasty issue I was seeing, I
overlooked that I was changing the underlying logic to that extent... Thanks!

What I want to avoid is the following: a vcpu wakes up on the busy pcpu Y. As
a consequence, the idle pcpu X is tickled. Then, for some unrelated reason,
pcpu Z reschedules and, as it would go idle too, it looks around for a vcpu
to steal, finds one in Y's runqueue and grabs it. Afterwards, when X gets the
IPI and schedules, it does not find anything to run and goes back to idling.

Now, suppose the vcpu has X, but *not* Z, in its node-affinity (while it has
a full vcpu-affinity, i.e., it can run everywhere). In this case, a vcpu that
could have run on a pcpu in its node-affinity executes outside of it. That
happens because the NODE_BALANCE_STEP in csched_load_balance(), when called
by Z, won't find anything suitable to steal (provided there actually isn't
any vcpu waiting in any runqueue that has Z in its node-affinity), while the
CPU_BALANCE_STEP will find our vcpu. :-(
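
To make the scenario concrete, here is a toy model of the two balancing
steps. It is only a sketch and not the Xen code: CPU sets are plain
bitmasks, and struct toy_vcpu, cpu_in(), toy_can_steal() and
toy_steal_step() are made-up names. Only NODE_BALANCE_STEP and
CPU_BALANCE_STEP are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption): a CPU set is a bitmask, bit i stands for pcpu i. */
typedef uint32_t cpuset_t;

enum balance_step { NODE_BALANCE_STEP, CPU_BALANCE_STEP };

struct toy_vcpu {
    cpuset_t cpu_affinity;   /* pcpus the vcpu is allowed to run on */
    cpuset_t node_affinity;  /* pcpus the vcpu would prefer to run on */
};

static bool cpu_in(cpuset_t set, int cpu)
{
    return (set >> cpu) & 1u;
}

/* May 'cpu' steal 'v' during the given balancing step? */
static bool toy_can_steal(const struct toy_vcpu *v, int cpu,
                          enum balance_step step)
{
    if (!cpu_in(v->cpu_affinity, cpu))
        return false;                        /* hard constraint, always */
    if (step == NODE_BALANCE_STEP)
        return cpu_in(v->node_affinity, cpu); /* preference, first pass only */
    return true;
}

/* First step at which 'cpu' would steal 'v', or -1 if it never can. */
static int toy_steal_step(const struct toy_vcpu *v, int cpu)
{
    for (int step = NODE_BALANCE_STEP; step <= CPU_BALANCE_STEP; step++)
        if (toy_can_steal(v, cpu, (enum balance_step)step))
            return step;
    return -1;
}
```

With X = 0, Y = 1, Z = 2, a vcpu with cpu_affinity 0x7 and node_affinity
0x3 is taken by X already in the NODE_BALANCE_STEP, while Z only gets it
in the CPU_BALANCE_STEP. If Z happens to run the balancer first, that
second step is enough for the vcpu to end up outside its node-affinity.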

So, what I wanted is something that could tell me whether the pcpu which is
stealing work is the one that has actually been tickled to do so. I was then
using the pcpu's idleness as a (cheap and easy to check) indication of that,
but I now see this has side effects that I did not want to cause in the
first place.
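
The side effect can be seen in a small standalone version of the check
quoted above. This is a sketch under assumptions: toy_cpumask_t,
toy_test_cpu() and toy_should_migrate() are invented bitmask stand-ins
for cpumask_t, cpumask_test_cpu() and __csched_vcpu_should_migrate(),
and I assume that a pcpu which is about to go idle is not yet in the
idlers mask while it is looking for work.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in (assumption): bit i of the mask stands for pcpu i. */
typedef uint32_t toy_cpumask_t;

static bool toy_test_cpu(int cpu, toy_cpumask_t mask)
{
    return (mask >> cpu) & 1u;
}

/*
 * Same shape as the quoted check: consent to the migration only if
 * 'cpu' is both idle and in the vcpu's affinity mask. Idleness is used
 * here as a proxy for "this is the pcpu that was tickled".
 */
static bool toy_should_migrate(int cpu, toy_cpumask_t mask,
                               toy_cpumask_t idlers)
{
    return toy_test_cpu(cpu, idlers) && toy_test_cpu(cpu, mask);
}
```

With idlers = 0x1 (only X = 0 is idle) and a full affinity mask, the
check accepts X and refuses Z = 2, which is what I wanted. It also
refuses any busy pcpu, say W = 3, that calls into load balancing because
the next vcpu on its runqueue is TS_OVER, and that is exactly the steal
the caller of csched_load_balance() asks for.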

Sorry for that, I probably spent so much time buried, as you were saying, in
the various nested loops and calls, that I lost the context a little bit! :-P

Ok, I think the problem I was describing is real, and I've seen it happening
and causing performance degradation. However, as I think a good solution is
going to be more complex than I thought, I'd better repost without this
function and deal with it in a separate future patch (after having figured
out the best way of doing so). Is that fine with you?

> These changes all look right.
>
At least. :-)

> But then, I'm a bit tired, so I'll give it
> another once-over tomorrow. :-)
>
I can imagine, looking forward to your next comments.

Thanks a lot and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
---------------------------------------------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
