Re: About vcpu wakeup and runq tickling in credit

All of lore.kernel.org
 help / color / mirror / Atom feed

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Dario Faggioli <raistlin@linux.it>
Cc: Keir Fraser <keir.xen@gmail.com>, Jan Beulich <JBeulich@suse.com>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: About vcpu wakeup and runq tickling in credit
Date: Tue, 23 Oct 2012 16:16:47 +0100	[thread overview]
Message-ID: <5086B4DF.6060701@eu.citrix.com> (raw)
In-Reply-To: <1350999260.5064.56.camel@Solace>

On 23/10/12 14:34, Dario Faggioli wrote:
> Hi George, Everyone,
>
> While reworking a bit my NUMA aware scheduling patches I figured I'm not
> sure I understand what __runq_tickle() (in xen/common/sched_credit.c, of
> course) does.
>
> Here's the thing. Upon every vcpu wakeup we put the new vcpu in a runq
> and then call __runq_tickle(), passing the waking vcpu via 'new'. Let's
> call the vcpu that just woke up v_W, and the vcpu that is currently
> running on the cpu where that happens v_C. Let's also call the CPU where
> all is happening P.
>
> As far as I've understood, in  __runq_tickle(), we:
>
>
> static inline void
> __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
> {
>      [...]
>      cpumask_t mask;
>
>      cpumask_clear(&mask);
>
>      /* If strictly higher priority than current VCPU, signal the CPU */
>      if ( new->pri > cur->pri )
>      {
>          [...]
>          cpumask_set_cpu(cpu, &mask);
>      }
>
> --> Make sure we put the CPU we are on (P) in 'mask', in case the woken
> --> vcpu (v_W) has higher priority that the currently running one (v_C).
>
>      /*
>       * If this CPU has at least two runnable VCPUs, we tickle any idlers to
>       * let them know there is runnable work in the system...
>       */
>      if ( cur->pri > CSCHED_PRI_IDLE )
>      {
>          if ( cpumask_empty(prv->idlers) )
>          [...]
>          else
>          {
>              cpumask_t idle_mask;
>
>              cpumask_and(&idle_mask, prv->idlers, new->vcpu->cpu_affinity);
>              if ( !cpumask_empty(&idle_mask) )
>              {
>                  [...]
>                  if ( opt_tickle_one_idle )
>                  {
>                      [...]
>                      cpumask_set_cpu(this_cpu(last_tickle_cpu), &mask);
>                  }
>                  else
>                      cpumask_or(&mask, &mask, &idle_mask);
>              }
>              cpumask_and(&mask, &mask, new->vcpu->cpu_affinity);
>
> --> Make sure we include one or more (depending on opt_tickle_one_idle)
> --> CPUs that are both idle and part of v_W's CPU-affinity in 'mask'.
>
>          }
>      }
>
>      /* Send scheduler interrupts to designated CPUs */
>      if ( !cpumask_empty(&mask) )
>          cpumask_raise_softirq(&mask, SCHEDULE_SOFTIRQ);
>
> --> Ask all the CPUs in 'mask' to reschedule. That would mean all the
> --> idlers from v_W's CPU-affinity and, possibly, "ourself" (P). The
> --> effect will be that all/some of the CPUs v_W's has affinity with
> --> _and_ (let's assume so) P will go through scheduling as quickly as
> --> possible.
>
> }
>
> Is the above right?

It looks right to me.

> If yes, here's my question. Is that right to always tickle v_W's affine
> CPUs and only them?
>
> I'm asking because a possible scenario, at least according to me, is
> that P schedules very quickly after this and, as prio(v_W)>prio(v_C), it
> selects v_W and leaves v_C in its runq. At that point, one of the
> tickled CPU (say P') enters schedule, sees that P is not idle, and tries
> to steal a vcpu from its runq. Now we know that P' has affinity with
> v_W, but v_W is not there, while v_C is, and if P' is not in its
> affinity, we've forced P' to reschedule for nothing.
> Also, there now might be another (or even a number of) CPU where v_C
> could run that stays idle, as it has not being tickled.

Yes -- the two clauses look a bit like they were conceived 
independently, and maybe no one thought about how they might interact.

> So, if that is true, it seems we leave some room for sub-optimal CPU
> utilization, as well as some non-work conserving windows.
> Of course, it is very hard to tell how frequent this actually happens.
>
> As it comes to possible solution, I think that, for instance, tickling
> all the CPUs in both v_W's and v_C's affinity masks could solve this,
> but that would also potentially increase the overhead (by asking _a_lot_
> of CPUs to reschedule), and again, it's hard to say if/when it's
> worth...

Well in my code, opt_tickle_idle_one is on by default, which means only 
one other cpu will be woken up.  If there were an easy way to make it 
wake up a CPU in v_C's affinity as well (supposing that there was no 
overlap), that would probably be a win.

Of course, that's only necessary if:
* v_C is lower priority than v_W
* There are no idlers that intersect both v_C and v_W's affinity mask.

It's probably a good idea though to try to set up a scenario where this 
might be an issue and see how often it actually happens.

  -George

next prev parent reply	other threads:[~2012-10-23 15:16 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-23 13:34 About vcpu wakeup and runq tickling in credit Dario Faggioli
2012-10-23 15:16 ` George Dunlap [this message]
2012-10-24 16:48   ` Dario Faggioli
2012-11-15 12:10   ` Dario Faggioli
2012-11-15 12:18     ` George Dunlap
2012-11-15 15:50       ` Dario Faggioli
2012-11-16 10:53       ` Dario Faggioli
2012-11-16 12:00         ` Dario Faggioli
2012-11-16 15:44           ` George Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5086B4DF.6060701@eu.citrix.com \
    --to=george.dunlap@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir.xen@gmail.com \
    --cc=raistlin@linux.it \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.