From: Dario Faggioli <dario.faggioli@citrix.com>
To: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Marcus Granado <Marcus.Granado@eu.citrix.com>,
	Dan Magenheimer <dan.magenheimer@oracle.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Anil Madhavapeddy <anil@recoil.org>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Juergen Gross <juergen.gross@ts.fujitsu.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	Jan Beulich <JBeulich@suse.com>,
	Daniel De Graaf <dgdegra@tycho.nsa.gov>,
	Matt Wilson <msw@amazon.com>
Subject: Re: [PATCH 04 of 11 v3] xen: sched_credit: let the scheduler know about node-affinity
Date: Wed, 13 Mar 2013 17:09:28 +0100	[thread overview]
Message-ID: <1363190968.3065.78.camel@Solace> (raw)
In-Reply-To: <CAFLBxZaoZoAHi-CLwWbiWr9xJO410UxO=JkTi=OgJdoOxckyMQ@mail.gmail.com>



Ok, here we are again! :-)

On mer, 2013-02-27 at 19:00 +0000, George Dunlap wrote:
> On Fri, Feb 1, 2013 at 11:01 AM, Dario Faggioli <dario.faggioli@citrix.com> wrote:

>         +
>         +/*
>         + * vcpu-affinity balancing is always necessary and must never be skipped.
>         + * OTOH, if a domain has affinity with all the nodes, we can tell the caller
>         + * that he can safely skip the node-affinity balancing step.
>         + */
>         +#define __vcpu_has_valuable_node_affinity(vc) \
>         +    ( !cpumask_full(CSCHED_DOM(vc->domain)->node_affinity_cpumask) )
> 
> 
> Something about the name of this one just doesn't strike me right.  I
> might be tempted just to go with "__vcpu_has_node_affinity", and let
> the comment above it explain what it means for the curious.
> 
Yes, I deeply hate it too, but I wasn't sure whether leaving the
"_valuable_" part out would have made things clear enough. Glad you
think it does; I'll kill it and, at the same time, improve the comment,
as you suggest.
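For reference, the renamed helper boils down to something like the
following standalone sketch. Note this is a toy model: `NR_CPUS`, the
64-bit mask representation and `cpumask_full()` here are simplified
stand-ins for Xen's real cpumask API, not the actual scheduler code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for Xen's cpumask type: one bit per CPU in a 64-bit word. */
#define NR_CPUS      8
#define CPUMASK_ALL  ((UINT64_C(1) << NR_CPUS) - 1)

static bool cpumask_full(uint64_t mask)
{
    return (mask & CPUMASK_ALL) == CPUMASK_ALL;
}

/*
 * vcpu-affinity balancing is always necessary and must never be skipped.
 * OTOH, if a domain has affinity with all the nodes, the caller can
 * safely skip the node-affinity balancing step.
 */
static bool vcpu_has_node_affinity(uint64_t node_affinity_cpumask)
{
    return !cpumask_full(node_affinity_cpumask);
}
```

With a full mask the helper returns false (nothing worth balancing on),
and true as soon as even one CPU is excluded from the node-affinity.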

>         +
>         +static inline int csched_balance_step_skippable(int step, struct vcpu *vc)
>         +{
>         +    if ( step == CSCHED_BALANCE_NODE_AFFINITY
>         +         && !__vcpu_has_valuable_node_affinity(vc) )
>         +        return 1;
>         +    return 0;
>         +}
> 
> 
> I'm not a fan of this kind of encapsulation, but I'll let it be I
> guess.  You missed a place to use it however -- in csched_runq_steal()
> you have the if() statement.
> 
Well, that was indeed on purpose: in one of the places I'm skipping the
balancing step as a whole, while in csched_runq_steal() I'm only
skipping the individual vCPUs, one after the other.

Anyway, the fact that you read it this way clearly demonstrates that my
attempt at being, let's say, precise did not work... I'm happy to kill
the function above and put the if() there directly.
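Inlined, the two-step loop then reads roughly like the sketch below.
The names (`BALANCE_NODE_AFFINITY`, `balance_passes`, the stub
predicate) are purely illustrative, not the actual credit-scheduler
identifiers:

```c
#include <stdbool.h>

enum balance_step {
    BALANCE_NODE_AFFINITY = 0,   /* first pass: node-affine vcpus only */
    BALANCE_VCPU_AFFINITY = 1,   /* second pass: plain vcpu-affinity   */
};

/* Hypothetical per-vcpu predicate, standing in for the helper above. */
static bool vcpu_has_node_affinity_stub(bool affine)
{
    return affine;
}

/* Count how many balancing passes actually run for a given vcpu. */
static int balance_passes(bool vcpu_is_node_affine)
{
    int steps_run = 0;

    for ( int step = BALANCE_NODE_AFFINITY;
          step <= BALANCE_VCPU_AFFINITY; step++ )
    {
        /*
         * The inlined skip: a node-affinity pass is pointless for a
         * vcpu whose node-affinity spans all the nodes.
         */
        if ( step == BALANCE_NODE_AFFINITY
             && !vcpu_has_node_affinity_stub(vcpu_is_node_affine) )
            continue;

        steps_run++;  /* ...the actual balancing work would go here... */
    }
    return steps_run;
}
```

A node-affine vcpu goes through both passes; one without a restricted
node-affinity only gets the vcpu-affinity pass.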


> Just an observation -- I think this will penalize systems that do not
> have node affinity enabled, in that if no one is using node
> affinities, then what will happen is the load balancing code will go
> through and check each vcpu on each processor, determine that there
> are no node affinities, and then go through again looking at vcpu
> affinities.  Going through twice for a single vcpu when doing
> placement is probably OK; but going all the way through all vcpus once
> could be pretty expensive.
> 
Well, yes, that is one of the downsides of scanning the runqueue in the
node-wise fashion we agreed upon during the review of one of the
previous rounds. Anyway...

> I think we should take this patch now (with the other minor changes
> mentioned above) so we can get it tested properly.  But we should
> probably try to do something to address this issue before the release
> -- maybe something like keeping a bitmask for each runqueue, saying
> whether any vcpus running on them have node affinity?  That way the
> first round we'll only check runqueues where we might conceivably
> steal something.
> 
...This sounds reasonable to me. I'm releasing v4 _without_ it, but with
a comment in the code. I'm already working on a solution along the lines
of what you suggest, and I'm sure I can release the outcome later as a
separate patch, if the series is in by then (as I hope! :-D), after
rerunning all my usual benchmarks (which is something I can't do right
now).
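One way to sketch the per-runqueue idea is the toy model below. It is
purely illustrative (field and function names are made up, and the real
patch would of course need proper locking on the runqueue): each
runqueue tracks how many of its vcpus have a restricted node-affinity,
so the node-affinity stealing pass can skip runqueues where nothing
could conceivably be stolen.

```c
#include <stdbool.h>

/*
 * Toy runqueue: keep a count of queued vcpus that have a restricted
 * node-affinity, updated at insert/remove time.
 */
struct toy_runq {
    int nr_node_affine;
};

static void toy_runq_insert(struct toy_runq *rq, bool has_node_affinity)
{
    if ( has_node_affinity )
        rq->nr_node_affine++;
}

static void toy_runq_remove(struct toy_runq *rq, bool has_node_affinity)
{
    if ( has_node_affinity )
        rq->nr_node_affine--;
}

/*
 * The node-affinity stealing pass only needs to look at runqueues
 * where at least one node-affine vcpu is queued.
 */
static bool toy_runq_worth_node_pass(const struct toy_runq *rq)
{
    return rq->nr_node_affine > 0;
}
```

The first balancing round would then iterate only over runqueues for
which `toy_runq_worth_node_pass()` is true, avoiding the full
all-vcpus scan on systems where nobody uses node-affinity.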

> Other than that, looks good -- thanks for the good work.
> 
Thanks to you for bearing with it (and with me :-)).
Dario


-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

