xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Anshul Makkar <anshul.makkar@citrix.com>,
	George Dunlap <george.dunlap@citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: [PATCH v2 03/10] xen: credit2: make tickling more deterministic
Date: Fri, 30 Sep 2016 04:53:39 +0200	[thread overview]
Message-ID: <147520401933.22544.13337692634316589872.stgit@Solace.fritz.box> (raw)
In-Reply-To: <147520253247.22544.10673844222866363947.stgit@Solace.fritz.box>

Right now, the following scenario can occurr:
 - upon vcpu v wakeup, v itself is put in the runqueue,
   and pcpu X is tickled;
 - pcpu Y schedules (for whatever reason), sees v in
   the runqueue and picks it up.

This may seem ok (or even a good thing), but it's not.
In fact, if runq_tickle() decided X is where v should
run, it did it for a reason (load distribution, SMT
support, cache hotness, affinity, etc), and we really
should try as hard as possible to stick to that.

Of course, we can't be too strict, or we risk leaving
vcpus in the runqueue while there is available CPU
capacity. So, we only leave v in runqueue --for X to
pick it up-- if we see that X has been tickled and
has not scheduled yet, i.e., it will have a real chance
of actually select and schedule v.

If that is not the case, we schedule it on Y (or, at
least, we consider that), as running somewhere non-ideal
is better than not running at all.

The commit also adds performance counters for each of
the possible situations.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes from v1:
 * always initialize tickled_cpu to -1, also for idle vcpus (in which cases, it
   just won't ever change to anything else than that), for improved readability
   and understandability;
 * logic for reporting back to csched_schedule() whether any vcpu was skipped,
   within runq_candidate(), and to only reset the credits if that did not
   happen moved out from here, to another patch.
---
 xen/common/sched_credit2.c   |   37 ++++++++++++++++++++++++++++++++++++-
 xen/include/xen/perfc_defn.h |    3 +++
 2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 5c7d0dc..3986441 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -54,6 +54,7 @@
 #define TRC_CSCHED2_LOAD_CHECK       TRC_SCHED_CLASS_EVT(CSCHED2, 16)
 #define TRC_CSCHED2_LOAD_BALANCE     TRC_SCHED_CLASS_EVT(CSCHED2, 17)
 #define TRC_CSCHED2_PICKED_CPU       TRC_SCHED_CLASS_EVT(CSCHED2, 19)
+#define TRC_CSCHED2_RUNQ_CANDIDATE   TRC_SCHED_CLASS_EVT(CSCHED2, 20)
 
 /*
  * WARNING: This is still in an experimental phase.  Status and work can be found at the
@@ -398,6 +399,7 @@ struct csched2_vcpu {
     int credit;
     s_time_t start_time; /* When we were scheduled (used for credit) */
     unsigned flags;      /* 16 bits doesn't seem to play well with clear_bit() */
+    int tickled_cpu;     /* cpu tickled for picking us up (-1 if none) */
 
     /* Individual contribution to load */
     s_time_t load_last_update;  /* Last time average was updated */
@@ -1049,6 +1051,10 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
     __cpumask_set_cpu(ipid, &rqd->tickled);
     smt_idle_mask_clear(ipid, &rqd->smt_idle);
     cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);
+
+    if ( unlikely(new->tickled_cpu != -1) )
+        SCHED_STAT_CRANK(tickled_cpu_overwritten);
+    new->tickled_cpu = ipid;
 }
 
 /*
@@ -1276,6 +1282,7 @@ csched2_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
         svc->credit = CSCHED2_IDLE_CREDIT;
         svc->weight = 0;
     }
+    svc->tickled_cpu = -1;
 
     SCHED_STAT_CRANK(vcpu_alloc);
 
@@ -2268,6 +2275,17 @@ runq_candidate(struct csched2_runqueue_data *rqd,
         if ( !cpumask_test_cpu(cpu, svc->vcpu->cpu_hard_affinity) )
             continue;
 
+        /*
+         * If a vcpu is meant to be picked up by another processor, and such
+         * processor has not scheduled yet, leave it in the runqueue for him.
+         */
+        if ( svc->tickled_cpu != -1 && svc->tickled_cpu != cpu &&
+             cpumask_test_cpu(svc->tickled_cpu, &rqd->tickled) )
+        {
+            SCHED_STAT_CRANK(deferred_to_tickled_cpu);
+            continue;
+        }
+
         /* If this is on a different processor, don't pull it unless
          * its credit is at least CSCHED2_MIGRATE_RESIST higher. */
         if ( svc->vcpu->processor != cpu
@@ -2284,9 +2302,25 @@ runq_candidate(struct csched2_runqueue_data *rqd,
 
         /* In any case, if we got this far, break. */
         break;
+    }
 
+    if ( unlikely(tb_init_done) )
+    {
+        struct {
+            unsigned vcpu:16, dom:16;
+            unsigned tickled_cpu;
+        } d;
+        d.dom = snext->vcpu->domain->domain_id;
+        d.vcpu = snext->vcpu->vcpu_id;
+        d.tickled_cpu = snext->tickled_cpu;
+        __trace_var(TRC_CSCHED2_RUNQ_CANDIDATE, 1,
+                    sizeof(d),
+                    (unsigned char *)&d);
     }
 
+    if ( unlikely(snext->tickled_cpu != -1 && snext->tickled_cpu != cpu) )
+        SCHED_STAT_CRANK(tickled_cpu_overridden);
+
     return snext;
 }
 
@@ -2351,7 +2385,7 @@ csched2_schedule(
         snext = CSCHED2_VCPU(idle_vcpu[cpu]);
     }
     else
-        snext=runq_candidate(rqd, scurr, cpu, now);
+        snext = runq_candidate(rqd, scurr, cpu, now);
 
     /* If switching from a non-idle runnable vcpu, put it
      * back on the runqueue. */
@@ -2390,6 +2424,7 @@ csched2_schedule(
         }
 
         snext->start_time = now;
+        snext->tickled_cpu = -1;
 
         /* Safe because lock for old processor is held */
         if ( snext->vcpu->processor != cpu )
diff --git a/xen/include/xen/perfc_defn.h b/xen/include/xen/perfc_defn.h
index a336c71..4a835b8 100644
--- a/xen/include/xen/perfc_defn.h
+++ b/xen/include/xen/perfc_defn.h
@@ -66,6 +66,9 @@ PERFCOUNTER(runtime_max_timer,      "csched2: runtime_max_timer")
 PERFCOUNTER(migrated,               "csched2: migrated")
 PERFCOUNTER(migrate_resisted,       "csched2: migrate_resisted")
 PERFCOUNTER(credit_reset,           "csched2: credit_reset")
+PERFCOUNTER(deferred_to_tickled_cpu,"csched2: deferred_to_tickled_cpu")
+PERFCOUNTER(tickled_cpu_overwritten,"csched2: tickled_cpu_overwritten")
+PERFCOUNTER(tickled_cpu_overridden, "csched2: tickled_cpu_overridden")
 
 PERFCOUNTER(need_flush_tlb_flush,   "PG_need_flush tlb flushes")
 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  parent reply	other threads:[~2016-09-30  2:53 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-30  2:53 [PATCH v2 00/10] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2! Dario Faggioli
2016-09-30  2:53 ` [PATCH v2 01/10] xen: credit1: return the 'time remaining to the limit' as next timeslice Dario Faggioli
2016-09-30 11:16   ` George Dunlap
2016-09-30  2:53 ` [PATCH v2 02/10] xen: credit1: don't rate limit context switches in case of yields Dario Faggioli
2016-09-30 11:18   ` George Dunlap
2016-09-30  2:53 ` Dario Faggioli [this message]
2016-09-30 11:25   ` [PATCH v2 03/10] xen: credit2: make tickling more deterministic George Dunlap
2016-09-30  2:53 ` [PATCH v2 04/10] xen: credit2: only reset credit on reset condition Dario Faggioli
2016-09-30 11:28   ` George Dunlap
2016-09-30 12:25   ` anshul makkar
2016-09-30 12:57     ` Dario Faggioli
2016-09-30  2:53 ` [PATCH v2 05/10] xen: credit2: implement yield() Dario Faggioli
2016-09-30 12:52   ` George Dunlap
2016-09-30 14:01     ` Dario Faggioli
2016-09-30  2:54 ` [PATCH v2 06/10] xen: tracing: add trace records for schedule and rate-limiting Dario Faggioli
2016-09-30 13:16   ` George Dunlap
2016-10-01  0:18   ` Meng Xu
2016-09-30  2:54 ` [PATCH v2 07/10] tools: tracing: handle more scheduling related events Dario Faggioli
2016-09-30 10:22   ` Ian Jackson
2016-09-30  2:54 ` [PATCH v2 08/10] libxl: fix coding style of credit1 parameters related functions Dario Faggioli
2016-09-30 10:24   ` Ian Jackson
2016-09-30 12:04     ` Dario Faggioli
2016-09-30 13:25       ` George Dunlap
2016-09-30  2:54 ` [PATCH v2 09/10] libxl: allow to set the ratelimit value online for Credit2 Dario Faggioli
2016-09-30 10:30   ` Ian Jackson
2016-09-30 10:33     ` George Dunlap
2016-09-30 10:35       ` Ian Jackson
2016-09-30 12:37       ` Dario Faggioli
2016-09-30  2:54 ` [PATCH v2 10/10] xl: " Dario Faggioli
2016-09-30 10:34   ` Ian Jackson
2016-09-30 15:54     ` Dario Faggioli
2016-09-30 16:02       ` Ian Jackson
2016-10-13 22:19   ` Jim Fehlig
2016-10-14 11:31     ` George Dunlap
2016-09-30 13:51 ` [PATCH v2 00/10] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2! George Dunlap
2016-09-30 14:06   ` Dario Faggioli
2016-09-30 14:10     ` George Dunlap
2016-09-30 14:12       ` Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=147520401933.22544.13337692634316589872.stgit@Solace.fritz.box \
    --to=dario.faggioli@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=anshul.makkar@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).