xen-devel.lists.xenproject.org archive mirror
From: Dario Faggioli <dario.faggioli@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Marcus Granado <Marcus.Granado@eu.citrix.com>,
	Dan Magenheimer <dan.magenheimer@oracle.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Anil Madhavapeddy <anil@recoil.org>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Juergen Gross <juergen.gross@ts.fujitsu.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Xen-Devel <xen-devel@lists.xen.org>, Matt Wilson <msw@amazon.com>,
	Daniel De Graaf <dgdegra@tycho.nsa.gov>
Subject: Re: [PATCH 03 of 11 v4] xen: sched_credit: when picking, make sure we get an idle one, if any
Date: Fri, 15 Mar 2013 11:37:54 +0100
Message-ID: <1363343874.3912.21.camel@Solace>
In-Reply-To: <5142E68102000078000C5CEE@nat28.tlf.novell.com>


On Fri, 2013-03-15 at 08:14 +0000, Jan Beulich wrote:
> >>> On 15.03.13 at 03:30, Dario Faggioli <dario.faggioli@citrix.com> wrote:
> > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> > --- a/xen/common/sched_credit.c
> > +++ b/xen/common/sched_credit.c
> > @@ -532,6 +532,18 @@ static int
> >      if ( vc->processor == cpu && IS_RUNQ_IDLE(cpu) )
> >          cpumask_set_cpu(cpu, &idlers);
> >      cpumask_and(&cpus, &cpus, &idlers);
> > +
> > +    /*
> > +     * It is important that cpu points to an idle processor, if a suitable
> > +     * one exists (and we can use cpus to check and, possibly, choose a new
> > +     * CPU, as we just &&-ed it with idlers). In fact, if we are on SMT, and
> > +     * cpu points to a busy thread with an idle sibling, both the threads
> > +     * will be considered the same, from the "idleness" calculation point
> > +     * of view", preventing vcpu from being moved to the thread that is
> > +     * actually idle.
> > +     */
> > +    if ( !cpumask_empty(&cpus) && !cpumask_test_cpu(cpu, &cpus) )
> 
> I think I had asked about this before - 
>
Did you? I don't remember anything like that but, if you did and I did
not answer, sorry for that! :-P

> what's the point of the
> left hand side of the &&? If the mask is empty, the right hand
> side will cover that quite well, at much less a price for high
> NR_CPUS (or nr_cpu_ids).
> 
And in fact it was not there at first, but ISTR having to add it because
not having it was leading to a very bad Xen crash... If I remember
correctly, this is what happens without it.

The code looks like this: 

    if ( !cpumask_empty(&cpus) && !cpumask_test_cpu(cpu, &cpus) )
        cpu = cpumask_cycle(cpu, &cpus);
    cpumask_clear_cpu(cpu, &cpus);

    while ( !cpumask_empty(&cpus) ) {
        ...
    }
    ...
    return cpu;

So, what happens if cpus is actually empty? As you say,
cpumask_test_cpu(cpu, &cpus) will be false, which means cpu is updated
with the result of cpumask_cycle(cpu, &cpus). If I'm reading the code
correctly, a cpumask_cycle() on an empty cpumask gives me nr_cpu_ids,
which is then what is returned (the while loop is not entered, so
nothing more happens to cpu), and that makes things explode...
Does that make sense?
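
To make the empty-mask case concrete, here is a minimal sketch in a toy
model. It is not Xen code: toy_mask_t, TOY_NR_CPU_IDS and the toy_*/pick_*
helpers are made up for illustration, and toy_cycle() only mimics my
reading of cpumask_cycle() (next set bit with wrap-around, nr_cpu_ids if
the mask is empty).

```c
#include <assert.h>

/* Toy model, NOT Xen's implementation: a CPU mask is one unsigned int,
 * and TOY_NR_CPU_IDS plays the role of nr_cpu_ids. */
#define TOY_NR_CPU_IDS 8u

typedef unsigned int toy_mask_t;   /* bit i set <=> CPU i is in the mask */

static int toy_test_cpu(unsigned int cpu, toy_mask_t m)
{
    return cpu < TOY_NR_CPU_IDS && ((m >> cpu) & 1u);
}

/* First set bit at or after 'start', or TOY_NR_CPU_IDS if there is none. */
static unsigned int toy_next_from(unsigned int start, toy_mask_t m)
{
    for ( unsigned int i = start; i < TOY_NR_CPU_IDS; i++ )
        if ( (m >> i) & 1u )
            return i;
    return TOY_NR_CPU_IDS;
}

/* Next set bit after 'cpu', wrapping around once.
 * On an empty mask this yields TOY_NR_CPU_IDS, i.e. not a valid CPU. */
static unsigned int toy_cycle(unsigned int cpu, toy_mask_t m)
{
    assert(cpu < TOY_NR_CPU_IDS);
    unsigned int nxt = toy_next_from(cpu + 1, m);
    return nxt < TOY_NR_CPU_IDS ? nxt : toy_next_from(0, m);
}

/* The pick step WITHOUT the emptiness check on the left of the &&. */
static unsigned int pick_unguarded(unsigned int cpu, toy_mask_t cpus)
{
    if ( !toy_test_cpu(cpu, cpus) )
        cpu = toy_cycle(cpu, cpus);
    return cpu;
}

/* The pick step WITH the emptiness check, as in the patch. */
static unsigned int pick_guarded(unsigned int cpu, toy_mask_t cpus)
{
    if ( cpus != 0 && !toy_test_cpu(cpu, cpus) )
        cpu = toy_cycle(cpu, cpus);
    return cpu;
}
```

In this model, pick_unguarded(3, 0) hands back TOY_NR_CPU_IDS, which is
not a usable CPU index, while pick_guarded(3, 0) leaves cpu at 3. With a
non-empty mask the two behave the same.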

Time has passed since I saw that backtrace, so it is possible that my
memory is not accurate... I can certainly try to reproduce it, if you
want to see the "smoking gun" :-)

Perhaps I can turn the condition into something like this:

    if ( !cpumask_test_cpu(cpu, &cpus) )
        cpu = cpumask_empty(&cpus) ? cpu : cpumask_cycle(cpu, &cpus);

So that we pay the price less frequently?
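
As a sanity check of that idea, here is a small sketch in a toy model
(again not Xen code: toy_mask_t, TOY_NR_CPU_IDS, the empty_checks counter
and all toy_*/pick_* names are invented for illustration). It compares the
condition from the patch with the alternative above over every mask and
starting CPU, and counts how often the emptiness check runs in each form.

```c
#include <assert.h>

#define TOY_NR_CPU_IDS 8u          /* stand-in for nr_cpu_ids */
typedef unsigned int toy_mask_t;   /* bit i set <=> CPU i is in the mask */

static unsigned long empty_checks; /* how many times toy_empty() ran */

static int toy_empty(toy_mask_t m) { empty_checks++; return m == 0; }
static int toy_test_cpu(unsigned int cpu, toy_mask_t m) { return (m >> cpu) & 1u; }

/* Next CPU in the mask after 'cpu', wrapping around;
 * TOY_NR_CPU_IDS if the mask is empty. */
static unsigned int toy_cycle(unsigned int cpu, toy_mask_t m)
{
    assert(cpu < TOY_NR_CPU_IDS);
    for ( unsigned int step = 1; step <= TOY_NR_CPU_IDS; step++ )
    {
        unsigned int c = (cpu + step) % TOY_NR_CPU_IDS;
        if ( (m >> c) & 1u )
            return c;
    }
    return TOY_NR_CPU_IDS;
}

/* Condition as in the patch: emptiness is always checked first. */
static unsigned int pick_patch(unsigned int cpu, toy_mask_t cpus)
{
    if ( !toy_empty(cpus) && !toy_test_cpu(cpu, cpus) )
        cpu = toy_cycle(cpu, cpus);
    return cpu;
}

/* Alternative: emptiness is checked only if cpu is not in the mask. */
static unsigned int pick_alt(unsigned int cpu, toy_mask_t cpus)
{
    if ( !toy_test_cpu(cpu, cpus) )
        cpu = toy_empty(cpus) ? cpu : toy_cycle(cpu, cpus);
    return cpu;
}

/* Returns 1 if both forms pick the same CPU for every mask and every cpu. */
static int variants_agree(void)
{
    for ( toy_mask_t m = 0; m < (1u << TOY_NR_CPU_IDS); m++ )
        for ( unsigned int cpu = 0; cpu < TOY_NR_CPU_IDS; cpu++ )
            if ( pick_patch(cpu, m) != pick_alt(cpu, m) )
                return 0;
    return 1;
}

/* Runs one form over every (mask, cpu) pair and returns the number of
 * emptiness checks it performed. */
static unsigned long count_empty_checks(int use_alt)
{
    empty_checks = 0;
    for ( toy_mask_t m = 0; m < (1u << TOY_NR_CPU_IDS); m++ )
        for ( unsigned int cpu = 0; cpu < TOY_NR_CPU_IDS; cpu++ )
            (void)(use_alt ? pick_alt(cpu, m) : pick_patch(cpu, m));
    return empty_checks;
}
```

In this toy enumeration the two forms always agree, and the alternative
skips the emptiness check whenever cpu is already in the mask. How much
that saves in practice depends on how often that is the case on a real
system, which a uniform enumeration like this says nothing about.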

> The ASSERT() a few lines earlier
> could be simplified in similar ways, btw.
> 
You mean this, right?

    online = cpupool_scheduler_cpumask(vc->domain->cpupool);
    cpumask_and(&cpus, online, vc->cpu_affinity);
    cpu = cpumask_test_cpu(vc->processor, &cpus)
            ? vc->processor
            : cpumask_cycle(vc->processor, &cpus);
    ASSERT( !cpumask_empty(&cpus) && cpumask_test_cpu(cpu, &cpus) );

Not sure. AFAIU the code, the ASSERT() is indeed meant to make sure
that cpus did not end up being empty as a consequence of the
cpumask_and(), and that check is combined with the cpumask_test_cpu()
just to have only one ASSERT() instead of two. But again, I might well
be wrong.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


